Abstract — We derive quantum counterparts of two key theorems
of classical information theory, namely, the rate distortion
theorem and the source-channel separation theorem. The
rate-distortion theorem gives the ultimate limits on lossy data 
compression, and the source-channel separation theorem implies 
that a two-stage protocol consisting of compression and channel 
coding is optimal for transmitting a memoryless source over 
a memoryless channel. In spite of their importance in the 
classical domain, there has been surprisingly little work in these 
areas for quantum information theory. In the present paper, 
we prove that the quantum rate distortion function is given in 
terms of the regularized entanglement of purification. We also 
determine a single-letter expression for the entanglement-assisted 
quantum rate distortion function, and we prove that it serves 
as a lower bound on the unassisted quantum rate distortion 
function. This implies that the unassisted quantum rate distortion 
function is non-negative and generally not equal to the coherent 
information between the source and distorted output (in spite 
of Barnum's conjecture that the coherent information would 
be relevant here). Moreover, we prove several quantum source- 
channel separation theorems. The strongest of these are in the 
entanglement-assisted setting, in which we establish a necessary 
and sufficient condition for transmitting a memoryless source over
a memoryless quantum channel up to a given distortion. 

Index Terms — quantum rate distortion, reverse Shannon the- 
orem, quantum Shannon theory, quantum data compression, 
source-channel separation 



I. Introduction 

Two pillars of classical information theory are Shannon's 
data compression theorem and his channel capacity theorem 
[48], [21]. The former gives a fundamental limit to the
compressibility of classical information, while the latter determines
the ultimate limit on classical communication rates over
mines the ultimate limit on classical communication rates over 
a noisy classical channel. Modern communication systems 
exploit these ideas in order to make the best possible use of 
communication resources. 

Data compression is possible due to statistical redundancy in 
the information emitted by sources, with some signals being 
emitted more frequently than others. Exploiting this redun- 
dancy suitably allows one to compress data without losing 
essential information. If the data which is recovered after 
the compression-decompression process is an exact replica 
of the original data, then the compression is said to be
lossless. The simplest example of an information source is 
a memoryless one. Such a source can be characterized by a 
random variable $U$ with probability distribution $\{p_U(u)\}$, and
each use of the source results in a letter $u$ being emitted with
probability $p_U(u)$. Shannon's noiseless coding theorem states
that the entropy $H(U) = -\sum_u p_U(u) \log_2 p_U(u)$ of such
an information source is the minimum rate at which we can
compress signals emitted by it [48], [21].

The requirement of a data compression scheme being loss- 
less is often too stringent a condition, in particular for the 
case of multimedia data, i.e., audio, video and still images 
or in scenarios where insufficient storage space is available. 
Typically a substantial amount of data can be discarded before 
the information is sufficiently degraded to be noticeable. 
A data compression scheme is said to be lossy when the 
decompressed data is not required to be identical to the original 
one, but instead recovering a reasonably good approximation 
of the original data is considered to be good enough. 

The theory of lossy data compression, which is also referred 
to as rate distortion theory, was developed by Shannon [49],
[11], [21]. This theory deals with the tradeoff between the
rate of data compression and the allowed distortion. Shannon 
proved that, for a given memoryless information source and 
a distortion measure, there is a function R(D), called the 
rate-distortion function, such that, if the maximum allowed 
distortion is D then the best possible compression rate is given 
by R(D). He established that this rate-distortion function is 
equal to the minimum of the mutual information
$I(U;\hat{U}) := H(U) + H(\hat{U}) - H(U,\hat{U})$ over all possible stochastic maps
$p_{\hat{U}|U}(\hat{u}|u)$ that meet the distortion requirement on average:

$R(D) = \min_{p(\hat{u}|u) \,:\, \mathbb{E}\{d(U,\hat{U})\} \le D} I(U;\hat{U}).$ (1)

In the above, $d(U,\hat{U})$ denotes a suitably chosen distortion
measure between the random variable $U$ characterizing the
source and the random variable $\hat{U}$ characterizing the output
of the stochastic map.

Whenever the distortion D = 0, the above rate-distortion 
function is equal to the entropy of the source. If D > 0, then 
the rate-distortion function is less than the entropy, implying 
that fewer bits are needed to transmit the source if we allow 
for some distortion in its reconstruction. 
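To make the tradeoff concrete, the sketch below evaluates the rate-distortion function for one textbook special case that is not taken from this paper: a Bernoulli(p) source under Hamming distortion, for which R(D) = h(p) - h(D) when 0 <= D <= min(p, 1-p) and R(D) = 0 otherwise, with h the binary entropy. The function names are illustrative choices.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bernoulli_rate_distortion(p: float, D: float) -> float:
    """R(D) in bits for a Bernoulli(p) source under Hamming distortion.

    Assumes the standard closed form h(p) - h(D) on 0 <= D < min(p, 1 - p).
    """
    if D < 0.0:
        raise ValueError("distortion must be non-negative")
    if D >= min(p, 1.0 - p):
        # Always guessing the likelier letter already meets the constraint.
        return 0.0
    return binary_entropy(p) - binary_entropy(D)


if __name__ == "__main__":
    p = 0.3
    for D in (0.0, 0.05, 0.1, 0.3):
        print(f"D = {D:.2f}  R(D) = {bernoulli_rate_distortion(p, D):.4f} bits")
```

At D = 0 the function returns the source entropy h(p), and it decreases as the allowed distortion grows, in line with the two statements above.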

Alongside these developments, Shannon also contributed the 
theory of reliable communication of classical data over classical
channels [48], [21]. His noisy channel coding theorem
gives an explicit expression for the capacity of a memoryless 
classical channel, i.e., the maximum rate of reliable communication
through it. A memoryless channel $\mathcal{N}$ is one for which
there is no correlation in the noise acting on successive inputs,
and it can be modelled by a stochastic map $\mathcal{N} \equiv p_{Y|X}(y|x)$.
Shannon proved that the capacity of such a channel is given
by

$C(\mathcal{N}) = \max_{p_X(x)} I(X;Y).$

Any scheme for error correction typically requires the use of 
redundancy in the transmitted data, so that the receiver can 
perfectly distinguish the received signals from one another in 
the limit of many uses of the channel. 
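As an illustration of the capacity formula, the following sketch approximates the maximization over input distributions by a grid scan for a binary-input channel. The binary symmetric channel, its flip probability 0.11, and all function names are assumptions chosen for the example, not objects from the paper. For this channel the scan should agree with the known closed form 1 - h(flip).

```python
import math


def entropy(dist):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * math.log2(q) for q in dist if q > 0.0)


def mutual_information(p_x, channel):
    """I(X;Y) in bits, where channel[x][y] = p(y|x)."""
    n_out = len(channel[0])
    p_y = [sum(p_x[x] * channel[x][y] for x in range(len(p_x)))
           for y in range(n_out)]
    h_y_given_x = sum(p_x[x] * entropy(channel[x]) for x in range(len(p_x)))
    return entropy(p_y) - h_y_given_x


def binary_input_capacity(channel, grid=10001):
    """Approximate C = max_{p_X} I(X;Y) by scanning P(X = 0) on a grid."""
    return max(
        mutual_information([k / (grid - 1), 1.0 - k / (grid - 1)], channel)
        for k in range(grid)
    )


# Assumed example: binary symmetric channel with flip probability 0.11.
flip = 0.11
bsc = [[1.0 - flip, flip], [flip, 1.0 - flip]]

if __name__ == "__main__":
    print(f"grid estimate : {binary_input_capacity(bsc):.6f} bits")
    print(f"closed form   : {1.0 - entropy([flip, 1.0 - flip]):.6f} bits")
```

A grid scan is only practical for very small input alphabets; for larger alphabets an iterative method such as Blahut-Arimoto would be the usual choice.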

Given all of the above results, we might wonder whether 
it is possible to transmit an information source U reliably 
over a noisy channel $\mathcal{N}$, such that the output of the information
source is recoverable with an error probability that is
asymptotically small in the limit of a large number of outputs 
of the information source and uses of the noisy channel. An 
immediate corollary of Shannon's noiseless and noisy channel 
coding theorems is that reliable transmission of the source 
is possible if the entropy of the source is smaller than the 
capacity of the channel: 

H(U) <C(M). (2) 

The scheme to demonstrate sufficiency of (2) is for the sender
to take the length $n$ output of the information source, compress
it down to $nH(U)$ bits, and encode these $nH(U)$ bits into a
length $n$ sequence for transmission over the channel. As long
as $H(U) < C(\mathcal{N})$, Shannon's noisy channel coding theorem
guarantees that it is possible to transmit the $nH(U)$ bits over
the channel reliably such that the receiver can decode them,
and Shannon's noiseless coding theorem guarantees that the
decoded $nH(U)$ bits can be decompressed reliably as well in
order to recover the original length $n$ output of the information
source (all of this is in the limit as $n \to \infty$). Given that
the condition in (2) is sufficient for reliable communication
of the information source, is it also necessary? Shannon's
source-channel separation theorem answers this question in
the affirmative [48], [21].

The most important implication of the source-channel separation
theorem is that we can consider the design of compression
codes and channel codes separately: a two-stage encoding
method is just as good as any other method, whenever the
source and channel are memoryless. Thus we should consider 
data compression and error correction as independent prob- 
lems, and try to design the best compression scheme and the 
best error correction scheme. The source-channel separation 
theorem guarantees that this two-stage encoding and decoding
with the best data compression and error correction codes will 
be optimal. 

Now what if the entropy of the source is greater than the 
capacity of the channel? Our best hope in this scenario is to 
allow for some distortion in the output of the source such 
that the rate of compression is smaller than the entropy of 
the source. Recall that whenever D > 0, the rate-distortion 
function R (D) is less than the entropy H (U) of the source. In 
this case, we have a variation of the source-channel separation 
theorem which states that the condition $R(D) < C(\mathcal{N})$ is
both necessary and sufficient for the reliable transmission of
an information source over a noisy channel, up to some amount
of distortion $D$ [21]. Thus, we can consider the problems of
lossy data compression and channel coding separately, and the
two-stage concatenation of the best lossy compression code
with the best channel code is optimal.
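The separation condition can be turned into a small calculation. The sketch below is an illustration under assumptions that are not part of the paper: a Bernoulli(p) source with Hamming distortion, a binary symmetric channel, and one channel use per source symbol. It bisects for the smallest distortion D at which R(D) no longer exceeds the capacity, treating the boundary case R(D) = C as admissible. The function names are hypothetical.

```python
import math


def h2(x: float) -> float:
    """Binary entropy in bits."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1.0 - x) * math.log2(1.0 - x)


def min_distortion(p: float, flip: float, tol: float = 1e-12) -> float:
    """Smallest D with R(D) <= C for a Bernoulli(p) source over a BSC(flip).

    Uses R(D) = h2(p) - h2(D) on [0, min(p, 1 - p)] and C = 1 - h2(flip).
    """
    capacity = 1.0 - h2(flip)
    if h2(p) <= capacity:
        return 0.0  # the source already fits through the channel
    lo, hi = 0.0, min(p, 1.0 - p)  # R(lo) > C and R(hi) = 0 <= C
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h2(p) - h2(mid) <= capacity:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    print(f"uniform source, BSC(0.11): D_min = {min_distortion(0.5, 0.11):.6f}")
    print(f"biased source,  BSC(0.01): D_min = {min_distortion(0.1, 0.01):.6f}")
```

For a uniform binary source the minimum distortion coincides with the flip probability of the channel, since the condition reduces to h2(D) >= h2(flip).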

Considering the importance of all of the above theorems 
for classical information theory, it is clear that theorems in 
this spirit would be just as important for quantum information 
theory. Note, however, that in the quantum domain, there 
are many different information processing tasks, depending 
on which type of information we are trying to transmit and 
which resources are available to assist the transmission. For 
example, we could transmit classical or quantum data over a 
quantum channel, and such a transmission might be assisted 
by entanglement shared between sender and receiver before 
communication begins. 

There have been many important advances in the above 
directions (some of which are summarized in the recent 
text [56]). Schumacher proved the noiseless quantum coding
theorem, demonstrating that the von Neumann entropy of
a quantum information source is the ultimate limit to the
compressibility of information emitted by it [44]. Hayashi
et al. have also considered many ways to compress quantum 
information, a summary of which is available in Ref. [29]. 

Quantum rate distortion theory, that is the theory of lossy 
quantum data compression, was introduced by Barnum in 
1998. He considered a symbol-wise entanglement fidelity as
a distortion measure [4] and, with respect to it, defined the 
quantum rate distortion function as the minimum rate of data 
compression, for any given distortion. He derived a lower 
bound on the quantum rate distortion function, in terms of
a well-known entropic quantity, namely the coherent information.
The latter can be viewed as one quantum analogue of
mutual information, since it is known to characterize the
quantum capacity of a channel [37], [51], [23], just as the
mutual information characterizes the capacity of a classical
channel. It is this analogy, and the fact that the classical rate 
distortion function is given in terms of the mutual information, 
that led Barnum to consider the coherent information as a 
candidate for the rate distortion function in the quantum realm. 
He also conjectured that this lower bound would be achievable. 

Since Barnum's paper, there have been a few papers in
which the problem of quantum rate distortion has either been
addressed [25], [20], or mentioned in other contexts [59], [30],
[39], [38]. However, not much progress has been made in
proving or disproving his conjecture. In fact, in the absence of
a matching upper bound, it is even unclear how good Barnum's
bound is, given that the coherent information can be negative,
as was pointed out in [25], [20].

There is also a plethora of results on information transmission
over quantum channels. Holevo [31], Schumacher,
and Westmoreland [47] provided a characterization of the
classical capacity of a quantum channel. Lloyd [37], Shor
[51], and Devetak [23] proved that the coherent information
of a quantum channel is an achievable rate for quantum
communication over that channel, building on prior work of
Nielsen and coworkers [46], [45], [6], [5], who showed that
its regularization is an upper bound on the quantum capacity
(note that the coherent information of a quantum channel
is always non-negative because it involves a maximization
over all inputs to the channel). Bennett et al. proved that
the mutual information of a quantum channel is equal to
its entanglement-assisted classical capacity [10] (the capacity
whenever the sender and receiver are given a large amount of
shared entanglement before communication begins).

In Ref. [10], the authors also introduced the idea of a reverse
Shannon theorem, in which a sender and receiver simulate a
noisy channel with as few noiseless resources as possible (later
papers rigorously proved several quantum reverse Shannon
theorems [1], [12], [8]). Although such a task might initially
seem unmotivated, they used a particular reverse Shannon
theorem to establish a strong converse for the entanglement-
assisted classical capacity.^1 Interestingly, the reverse Shannon
theorems can also find application in rate distortion theory
[59], [30], [39], [38], and as such, they are relevant for our
purposes here.

In this paper, we prove several important quantum rate 
distortion theorems and quantum source-channel separation 
theorems. Our first result in quantum rate distortion is a 
complete characterization of the rate distortion function in an 
entanglement-assisted setting.^2 This result really only makes
sense in the communication paradigm (and not in a storage 
setting), where we give the sender and receiver shared en- 
tanglement before communication begins, in addition to the 
uses of the noiseless qubit channel. The idea here is for 
a sender to exploit the shared entanglement and a minimal 
amount of classical or quantum communication in order for 
the receiver to recover the output of the quantum information 
source up to some distortion. Our main result is a single-letter 
formula for the entanglement-assisted rate distortion function, 
expressed in terms of a minimization of the input-output 
mutual information over all quantum operations that meet the 
distortion constraint. This result implies that the computation 
of the entanglement-assisted rate distortion function for any 
quantum information source is a tractable convex optimization 
program. It is often the case in quantum Shannon theory 
that the entanglement-assisted formulas end up being formally 
analogous to Shannon's classical formulas [10], [28], and our
result here is no exception to this trend. 

We next consider perhaps the most natural setting for quan- 
tum rate distortion in which a compressor tries to compress 
a quantum information source so that a decompressor can 
recover it up to some distortion D (this setting is the same as 
Barnum's in Ref. [4]). This setting is most natural whenever
sufficient quantum storage is not available, but we can equiva- 
lently phrase it in a communication paradigm, where a sender 
has access to many uses of a noiseless qubit channel and would 
like to minimize the use of this resource while transmitting 
a quantum information source up to some distortion. We
prove that the quantum rate distortion function is given in
terms of a regularized entanglement of purification [54] in
this case. Although our characterization is an intractable,
regularized formula, our result at the very least shows that
the quantum rate distortion function is always non-negative,
demonstrating that Barnum's conjecture from Ref. [4] does not
hold since his proposed rate-distortion function can become
negative. Furthermore, we prove that the entanglement-assisted
quantum rate distortion function is a single-letter lower bound
on the unassisted quantum rate distortion function (one might
suspect that this should hold because additional resources
such as shared entanglement should only be able to improve
compression rates). This bound implies that the coherent
information between the source and distorted output is not
relevant for unassisted quantum rate distortion, in spite of
Barnum's conjecture that it would be.

^1 A strong converse demonstrates that the error probability asymptotically
approaches one if the rate of communication is larger than capacity. This is in
contrast to a weak converse, which only demonstrates that the error probability
is bounded away from zero under the same conditions.

^2 One might consider these entanglement-assisted rate distortion results to
be part of the "quantum reverse Shannon theorem folklore," but Ref. [8] does
not specifically discuss this topic.
We finally prove three source-channel separation theorems 
that apply to the transmission of a classical source over a 
quantum channel, the transmission of a quantum source over 
a quantum channel, and the transmission of a quantum source 
over an entanglement-assisted quantum channel, respectively. 
The first two source-channel separation theorems are single- 
letter, in the sense that they do not involve any regularised 
quantities, whenever the Holevo capacity or the coherent 
information of the channel are additive, respectively. The third 
theorem is single-letter in all cases because the entanglement- 
assisted quantum capacity is given by a single-letter expression 
for all quantum channels [2], [10]. We also prove a related
set of source-channel separation theorems that allow for some
distortion in the reconstruction of the output of the information
source. From these theorems we infer that it is best to search
for the best quantum data compression protocols [16], [13],
[9], [3], [41], [42], the best quantum error-correcting codes
[50], [19], [18], [40], [43], [35], and the best entanglement-
assisted quantum error-correcting codes (TP, [32], [35], [57]
independently of each other whenever the source and channel 
are memoryless. The theorems then guarantee that combining 
these protocols in a two-stage encoding and decoding is 
optimal. 

We structure this paper as follows. We first overview relevant
notation and definitions in the next section. Section III
introduces the information processing task relevant for quantum
rate distortion and then presents all of our quantum rate
distortion results in detail. Section IV presents our various
quantum source-channel separation theorems for memoryless
sources and channels. Finally, we conclude in Section V and
discuss important open questions.

II. Notation and Definitions 

Let $\mathcal{H}$ denote a finite-dimensional Hilbert space and let
$\mathcal{D}(\mathcal{H})$ denote the set of density matrices or states (i.e., positive
operators of unit trace) acting on $\mathcal{H}$. Let $\rho_A \in \mathcal{D}(\mathcal{H}_A)$ denote
the state characterizing a memoryless quantum information
source, the subscript $A$ being used to denote the underlying
quantum system. We refer to it as the source state. Let
$|\psi^\rho_{RA}\rangle \in \mathcal{H}_R \otimes \mathcal{H}_A$ denote its purification, that is,
$\psi^\rho_{RA} := |\psi^\rho_{RA}\rangle\langle\psi^\rho_{RA}|$
is a pure state density matrix of a larger composite system
$RA$, such that its restriction on the system $A$ is given by $\rho_A$,
i.e., $\rho_A := \mathrm{Tr}_R\{\psi^\rho_{RA}\}$, with $\mathrm{Tr}_R$ denoting the partial trace over
the Hilbert space $\mathcal{H}_R$ of a purifying reference system $R$. The
pure state $|\psi^\rho_{RA}\rangle$ is entangled if $\rho_A$ is a mixed state. The von
Neumann entropy of $\rho_A$, and hence of the source, is defined
as

$H(A)_\rho := -\mathrm{Tr}\{\rho_A \log \rho_A\}.$ (3)

The quantum mutual information of a bipartite state $\omega_{AB}$ is
defined as

$I(A;B)_\omega := H(A)_\omega + H(B)_\omega - H(AB)_\omega.$

The coherent information $I(A\rangle B)_\sigma$ of a bipartite state $\sigma_{AB}$ is
defined as follows:

$I(A\rangle B)_\sigma := H(B)_\sigma - H(AB)_\sigma.$ (4)
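The definitions above can be evaluated by hand for a simple family of states. The sketch below is an added illustration under an explicit assumption: the state is the two-qubit isotropic state (1 - q) Phi + q I/4, with Phi a Bell state, whose spectrum is known in closed form (one eigenvalue 1 - 3q/4 and three eigenvalues q/4) and whose marginals are both maximally mixed. The function names are hypothetical. The example shows that the coherent information turns negative for noisy states while the mutual information stays non-negative.

```python
import math


def shannon(spectrum):
    """Entropy in bits of a list of eigenvalues (a probability vector).

    The von Neumann entropy of a state equals the Shannon entropy of its
    eigenvalues, so a known spectrum is all that is needed here.
    """
    return -sum(x * math.log2(x) for x in spectrum if x > 0.0)


def isotropic_entropies(q: float):
    """Mutual and coherent information of (1 - q) * Phi + q * I/4."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    joint = [1.0 - 0.75 * q] + [0.25 * q] * 3  # assumed closed-form spectrum
    marginal = [0.5, 0.5]                      # both reductions equal I/2
    h_a, h_b, h_ab = shannon(marginal), shannon(marginal), shannon(joint)
    return {"mutual": h_a + h_b - h_ab, "coherent": h_b - h_ab}


if __name__ == "__main__":
    for q in (0.0, 0.25, 0.5, 1.0):
        vals = isotropic_entropies(q)
        print(f"q = {q:.2f}  I(A;B) = {vals['mutual']:.4f}"
              f"  I(A>B) = {vals['coherent']:.4f}")
```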

In quantum information theory, the most general mathemat- 
ical description of any allowed physical operation is given by 
a completely positive trace-preserving (CPTP) map, which is a 
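map between states. We let $\mathrm{id}_A$ denote the trivial (or identity)
CPTP map which keeps the state of a quantum system $A$
unchanged, and we let $\mathcal{N} \equiv \mathcal{N}^{A \to B}$ denote the CPTP map

$\mathcal{N}^{A \to B} : \mathcal{D}(\mathcal{H}_A) \to \mathcal{D}(\mathcal{H}_B).$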

The entanglement of purification of a bipartite state $\omega_{AB}$ is
a measure of correlations [54], having an operational interpretation
as the entanglement cost of creating $\omega_{AB}$ asymptotically
from ebits, while consuming a negligible amount of classical
communication. It is equivalent to the following expression:

$E_p(\omega_{AB}) := \min_{\mathcal{N}_E} H\big((\mathrm{id}_B \otimes \mathcal{N}_E)(\mu_{BE}(\omega))\big),$

where $\mu_{BE}(\omega) = \mathrm{Tr}_A\{\phi_{ABE}\}$, $\phi_{ABE}$ is some purification
of $\omega_{AB}$, and the minimization is over all CPTP maps $\mathcal{N}_E$
acting on the system $E$. (The original definition in Ref. [54] is
different from the above, but one can check that the definition
given here is equivalent to the one given there.)

In this paper we make use of resource inequalities (see,
e.g., [26]) to express information-processing tasks as
interconversions between resources. Let $[c \to c]$ denote one forward
use of a noiseless classical bit channel, $[q \to q]$ one
forward use of a noiseless qubit channel, and $[qq]$ one ebit
of shared entanglement (a Bell state). A simple example of a
resource inequality is entanglement distribution:

$[q \to q] \ge [qq],$

meaning that Alice can consume one noiseless qubit channel in
order to generate one ebit between her and Bob. Teleportation
is a more interesting way in which all three resources interact:

$2[c \to c] + [qq] \ge [q \to q].$

The above resource inequalities are finite and exact, but we
can also express quantum Shannon theoretic protocols as
resource inequalities. For example, the resource inequality
for the protocol achieving the entanglement-assisted classical
capacity of a quantum channel is as follows:

$\langle \mathcal{N} \rangle + H(A)\,[qq] \ge I(A;B)\,[c \to c].$

The meaning of the above resource inequality is that there
exists a protocol exploiting $n$ uses of a memoryless quantum
channel $\mathcal{N}$ and $nH(A)$ ebits in order to transmit $nI(A;B)$
classical bits from sender to receiver. The resource inequality
becomes exact in the asymptotic limit $n \to \infty$ because it is
possible to show that the probability of decoding these
classical bits incorrectly approaches zero as $n \to \infty$ [10].
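As a quick sanity check on this resource inequality, consider a special case that we add here for illustration: the channel is a noiseless qubit channel and its input is half of a Bell state. Under that assumption the entropies can be read off directly, and the inequality reduces to the familiar super-dense coding statement.

```latex
% Assumption: \mathcal{N} = \mathrm{id}^{A' \to B} is a noiseless qubit channel
% and the input is half of a Bell state, so \omega_{AB} is itself a Bell state.
\begin{align*}
H(A)_\omega &= 1, \qquad H(B)_\omega = 1, \qquad H(AB)_\omega = 0, \\
I(A;B)_\omega &= H(A)_\omega + H(B)_\omega - H(AB)_\omega = 2, \\
\langle \mathrm{id} \rangle + H(A)\,[qq] \ge I(A;B)\,[c \to c]
  \quad &\Longrightarrow \quad [q \to q] + [qq] \ge 2\,[c \to c].
\end{align*}
```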

III. Quantum Rate-Distortion 

A. The Information Processing Task 

The objective of any quantum rate distortion protocol is to 
compress a quantum information source such that the decom- 
pressor can reconstruct the original state up to some distortion. 
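Like Barnum [4], we consider the following distortion measure
$d(\rho,\mathcal{N})$ for a state $\rho_A \in \mathcal{D}(\mathcal{H}_A)$ with purification $|\psi^\rho_{RA}\rangle$ and
a quantum operation $\mathcal{N} \equiv \mathcal{N}^{A \to B}$:

$d(\rho,\mathcal{N}) := 1 - F_e(\rho,\mathcal{N}),$ (5)

where $F_e$ is the entanglement fidelity of the map $\mathcal{N}$:

$F_e(\rho,\mathcal{N}) := \langle\psi^\rho_{RA}| (\mathrm{id}_R \otimes \mathcal{N}^{A \to B})(\psi^\rho_{RA}) |\psi^\rho_{RA}\rangle.$ (6)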

The entanglement fidelity is not only a natural distortion 
measure, but it also possesses several analytical properties 
which prove useful in our analysis. 
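One property that makes the entanglement fidelity convenient is a standard identity, used here as an assumption rather than quoted from this paper: for a channel with Kraus operators K_k, F_e(rho, N) = sum_k |Tr(rho K_k)|^2, so the purification never has to be built explicitly. The sketch below applies it to a qubit dephasing channel. The choice of channel and all function names are illustrative.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    """Trace of a 2x2 matrix."""
    return a[0][0] + a[1][1]


def entanglement_fidelity(rho, kraus_ops):
    """F_e(rho, N) = sum_k |Tr(rho K_k)|^2 for N(X) = sum_k K_k X K_k^dagger."""
    return sum(abs(trace(matmul(rho, k))) ** 2 for k in kraus_ops)


def dephasing_kraus(p: float):
    """Kraus operators sqrt(1-p) * I and sqrt(p) * Z of a dephasing channel."""
    a, b = math.sqrt(1.0 - p), math.sqrt(p)
    return [[[a, 0.0], [0.0, a]], [[b, 0.0], [0.0, -b]]]


if __name__ == "__main__":
    mixed = [[0.5, 0.0], [0.0, 0.5]]
    biased = [[0.9, 0.0], [0.0, 0.1]]
    for name, rho in (("maximally mixed", mixed), ("biased", biased)):
        f_e = entanglement_fidelity(rho, dephasing_kraus(0.2))
        print(f"{name}: F_e = {f_e:.4f}, distortion d = {1.0 - f_e:.4f}")
```

The distortion depends on the source state as well as the channel: the same dephasing channel distorts the maximally mixed source more than the biased one.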

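The state $\rho^n := (\rho_A)^{\otimes n} \in \mathcal{D}(\mathcal{H}_A^{\otimes n})$ characterizes $n$ successive
outputs of a memoryless quantum information source. A
source coding (or compression-decompression) scheme of rate
$R$ is defined by a block code, which consists of two quantum
operations: the encoding and decoding maps. The encoding
$\mathcal{E}_n$ is a map from $n$ copies of the source space to a subspace
$\mathcal{H}_{Q^n} \subset \mathcal{H}_A^{\otimes n}$ of dimension $2^{nR}$:

$\mathcal{E}_n : \mathcal{D}(\mathcal{H}_A^{\otimes n}) \to \mathcal{D}(\mathcal{H}_{Q^n}),$

and the decoding $\mathcal{D}_n$ is a map from the compressed subspace
to an output Hilbert space $\mathcal{H}_A^{\otimes n}$:

$\mathcal{D}_n : \mathcal{D}(\mathcal{H}_{Q^n}) \to \mathcal{D}(\mathcal{H}_A^{\otimes n}).$

The average distortion resulting from this compression-
decompression scheme is defined as [4]:

$\bar{d}(\rho, \mathcal{D}_n \circ \mathcal{E}_n) := \sum_{i=1}^n \frac{1}{n}\, d(\rho, \mathcal{F}_n^{(i)}),$

where $\mathcal{F}_n^{(i)}$ is the "marginal operation" on the $i$-th copy of the
source space induced by the overall operation $\mathcal{F}_n \equiv \mathcal{D}_n \circ \mathcal{E}_n$,
and is defined as

$\mathcal{F}_n^{(i)}(\rho) := \mathrm{Tr}_{A_1, \ldots, A_{i-1}, A_{i+1}, \ldots, A_n}[\mathcal{F}_n(\rho^{\otimes n})].$ (7)

The quantum operations $\mathcal{D}_n$ and $\mathcal{E}_n$ define an $(n,R)$ quantum
rate distortion code.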

For any $R, D \ge 0$, the pair $(R,D)$ is said to be an
achievable rate distortion pair if there exists a sequence of
$(n,R)$ quantum rate distortion codes $(\mathcal{E}_n, \mathcal{D}_n)$ such that

$\lim_{n \to \infty} \bar{d}(\rho, \mathcal{D}_n \circ \mathcal{E}_n) \le D.$ (8)

The quantum rate distortion function is then defined as

$R^q(D) := \inf\{R : (R,D) \text{ is achievable}\}.$
Fig. 1. The most general protocols for (a) unassisted and (b) assisted quantum
rate distortion coding. In (a), Alice acts on the tensor power output of the
quantum information source with a compression encoding $\mathcal{E}$. She sends the
compressed qubits over noiseless quantum channels (labeled by "id") to Bob,
who then performs a decompression map $\mathcal{D}$ to recover the quantum data that
Alice sent. In (b), the task is similar, though this time we assume that Alice
and Bob share entanglement before communication begins.

In the communication model, if the sender and receiver have 
unlimited prior shared entanglement at their disposal, then 
the corresponding quantum rate distortion function is denoted 
as $R^q_{eac}(D)$ or $R^q_{eaq}(D)$, depending on whether the noiseless
channel between the sender and the receiver is classical or
quantum. Figure 1 depicts the most general protocols for
unassisted and assisted quantum rate distortion coding. 

B. Reverse Shannon Theorems and Quantum Rate-Distortion
Coding

Before we begin with our main results, we first prove 
Lemma 1 below. This lemma is similar in spirit to Lemma 26
of Ref. EH and Theorem 19 of Ref. EH, and like them, 
it shows that to generate a rate-distortion code, it suffices 
to simulate the action of a noisy channel on a source state 
such that the resulting output state meets the desired distortion 
criterion. Unlike them, however, it is specifically tailored to 
the entanglement fidelity distortion measure. 

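Lemma 1: Fix $\varepsilon > 0$ and $0 \le D < 1$. Consider a state $\rho_A$
with purification $|\psi^\rho_{RA}\rangle$ and a quantum channel $\mathcal{N} \equiv \mathcal{N}^{A \to B}$
for which $d(\rho,\mathcal{N}) \le D$. Let

$\omega_{RB} := (\mathrm{id}_R \otimes \mathcal{N})(\psi^\rho_{RA}).$

Furthermore, let $\{\mathcal{F}_n\}_n$ denote a sequence of quantum operations
such that for $n$ large enough,

$\| \sigma_{R^n B^n} - \omega_{RB}^{\otimes n} \|_1 \le \varepsilon,$ (9)

where

$\sigma_{R^n B^n} := (\mathrm{id}_{R^n} \otimes \mathcal{F}_n)((\psi^\rho_{RA})^{\otimes n}).$

Then for $n$ large enough, the average distortion under the
quantum operation $\mathcal{F}_n$ satisfies the bound

$\bar{d}(\rho,\mathcal{F}_n) \le D + \varepsilon.$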

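Proof: Expressing $R^n = R_1 R_2 \cdots R_n$ and $B^n = B_1 B_2 \cdots B_n$,
we have for any $1 \le i \le n$,

$\sigma_{R_i B_i} = (\mathrm{id}_{R_i} \otimes \mathcal{F}_n^{(i)})(\psi^\rho_{RA}).$ (10)

By monotonicity of the trace distance under partial trace, we
have that

$\| \sigma_{R_i B_i} - \omega_{RB} \|_1 \le \| \sigma_{R^n B^n} - \omega_{RB}^{\otimes n} \|_1.$ (11)

Hence, the average distortion under the quantum operation $\mathcal{F}_n$
is given by

$\bar{d}(\rho,\mathcal{F}_n) = \frac{1}{n}\sum_{i=1}^n \big(1 - F_e(\rho,\mathcal{F}_n^{(i)})\big)$
$= \frac{1}{n}\sum_{i=1}^n \big(1 - \langle\psi^\rho_{RA}|\sigma_{R_i B_i}|\psi^\rho_{RA}\rangle\big).$ (12)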

Recall the following inequality from Ref. iTTsTl : 

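$\mathrm{Tr}\{P(A-B)\} \ge \mathrm{Tr}\{(A-B)_-\},$ (13)

where $0 \le P \le I$ is any positive operator and $(A-B)_-$
denotes the negative spectral part of the operator $(A-B)$.
We then have the following inequalities:

$\langle\psi^\rho_{RA}|\sigma_{R_i B_i}|\psi^\rho_{RA}\rangle = \langle\psi^\rho_{RA}|\omega_{RB}|\psi^\rho_{RA}\rangle + \mathrm{Tr}\{\psi^\rho_{RA}(\sigma_{R_i B_i} - \omega_{RB})\}$
$\ge F_e(\rho,\mathcal{N}) + \mathrm{Tr}\{(\sigma_{R_i B_i} - \omega_{RB})_-\},$ (14)

where the inequality follows from (13) and the definition of
entanglement fidelity: $F_e(\rho,\mathcal{N}) = \langle\psi^\rho_{RA}|\omega_{RB}|\psi^\rho_{RA}\rangle$.
Hence, from (12), (14), and (11), we have

$\bar{d}(\rho,\mathcal{F}_n) \le \frac{1}{n}\sum_{i=1}^n \big(1 - F_e(\rho,\mathcal{N}) - \mathrm{Tr}\{(\sigma_{R_i B_i} - \omega_{RB})_-\}\big)$
$\le d(\rho,\mathcal{N}) + \frac{1}{n}\sum_{i=1}^n \| \sigma_{R_i B_i} - \omega_{RB} \|_1$
$\le d(\rho,\mathcal{N}) + \| \sigma_{R^n B^n} - \omega_{RB}^{\otimes n} \|_1$
$\le D + \varepsilon,$ (15)

which concludes the proof of the lemma. ■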
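The operator inequality (13) used in the proof can be spot-checked numerically. The sketch below is an added illustration, restricted by assumption to real symmetric 2x2 matrices so that eigenvalues are available in closed form. It draws a random matrix X in the role of A - B, draws a random P with 0 <= P <= I, and confirms that Tr{P X} is never smaller than the sum of the negative eigenvalues of X. All helper names are hypothetical.

```python
import math
import random


def negative_part_trace(x):
    """Sum of the negative eigenvalues of a real symmetric 2x2 matrix."""
    a, b, c = x[0][0], x[0][1], x[1][1]
    mean, radius = (a + c) / 2.0, math.hypot((a - c) / 2.0, b)
    return sum(ev for ev in (mean - radius, mean + radius) if ev < 0.0)


def random_symmetric(rng):
    """Random real symmetric 2x2 matrix, standing in for A - B."""
    a, b, c = (rng.uniform(-1.0, 1.0) for _ in range(3))
    return [[a, b], [b, c]]


def random_effect(rng):
    """Random P with 0 <= P <= I: a rotated diag(p1, p2), p1, p2 in [0, 1]."""
    p1, p2, t = rng.random(), rng.random(), rng.uniform(0.0, math.pi)
    cs, sn = math.cos(t), math.sin(t)
    off = (p1 - p2) * cs * sn
    return [[p1 * cs * cs + p2 * sn * sn, off],
            [off, p1 * sn * sn + p2 * cs * cs]]


def trace_product(p, x):
    """Tr{P X} for 2x2 matrices."""
    return sum(p[i][j] * x[j][i] for i in range(2) for j in range(2))


def check_inequality(trials: int = 10000, seed: int = 0) -> bool:
    """Return True if Tr{P X} >= Tr{X_-} held in every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, p = random_symmetric(rng), random_effect(rng)
        if trace_product(p, x) < negative_part_trace(x) - 1e-12:
            return False
    return True


if __name__ == "__main__":
    print("inequality (13) held in all trials:", check_inequality())
```

A random test of this kind is not a proof; it only guards against a sign or convention error when reading (13).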
The above lemma illustrates a fundamental connection
between quantum reverse Shannon theorems and quantum 
rate-distortion protocols. In particular, if a reverse Shannon 
theorem is available in a given context, then it immediately 
leads to a rate-distortion protocol. This is done simply by 
choosing the simulated channel to be the one which, when 
acting on the source state, yields an output state which meets 
the distortion criterion for the desired rate-distortion task. This 
is our approach in all of the quantum rate-distortion theorems 
that follow, and it was also the approach in Refs. [59], [38].

There is, however, one caveat with the above approach. 
The reverse Shannon theorems often require extra correlated 
resources such as shared randomness or shared entanglement 
[10], [1], [8], [12], and the demands of a reverse Shannon
theorem are much more stringent than those of a rate-distortion 
protocol. A reverse Shannon theorem requires the simulation 
of a channel to be asymptotically exact, whereas a rate- 
distortion protocol only demands that a source be recon- 
structed up to some average distortion constraint. The differ- 
ences in these goals can impact resulting rates if sufficient 
correlated resources are not available [22].

In the entanglement-assisted setting considered in the next 
subsection, the assumption is that an unlimited supply of 
entanglement is available, and thus the entanglement-assisted 
quantum reverse Shannon theorem suffices for producing a 
good entanglement-assisted rate-distortion protocol. In the 
unassisted setting, no correlation is available, and exploiting 
the unassisted reverse Shannon theorem leads to rates that are 
possibly larger than necessary for the task of quantum rate 
distortion. Nevertheless, we still employ this approach and dis- 
cuss the ramifications further in the forthcoming subsections. 

C. Entanglement-Assisted Rate-Distortion Coding

1) Rate-Distortion with noiseless classical communication:
The quantum rate distortion function, $R^q_{eac}(D)$, for
entanglement-assisted lossy source coding with noiseless classical
communication, is given by the following theorem.

Theorem 2: For a memoryless quantum information source
defined by the density matrix $\rho_{A'}$, with a purification $|\psi^\rho_{AA'}\rangle$,
and any given distortion $0 \le D < 1$, the quantum rate distortion
function for entanglement-assisted lossy source coding
with noiseless classical communication is given by

$R^q_{eac}(D) = \min_{\mathcal{N} \,:\, d(\rho,\mathcal{N}) \le D} I(A;B)_\omega,$ (16)

where $\mathcal{N} \equiv \mathcal{N}^{A' \to B}$ denotes a CPTP map,

$\omega_{AB} := (\mathrm{id}_A \otimes \mathcal{N}^{A' \to B})(\psi^\rho_{AA'}),$

and $I(A;B)_\omega$ denotes the mutual information.
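To get a feel for the formula in Theorem 2, one can restrict the minimization to a one-parameter family of channels, which yields an upper bound on the rate. The sketch below is an added illustration under explicit assumptions: the source is a maximally mixed qubit, and only depolarizing channels are considered. The output state is then isotropic with spectrum {1 - D, D/3, D/3, D/3}, where D is the entanglement-fidelity distortion. The function name is hypothetical, and no claim is made here that this family attains the minimum.

```python
import math


def rate_upper_bound(D: float) -> float:
    """Mutual information I(A;B) of the depolarizing channel with distortion D.

    Assumes a maximally mixed qubit source, so H(A) = H(B) = 1 and the joint
    state has spectrum {1 - D, D/3, D/3, D/3}. Because this channel is feasible
    for the minimization in Theorem 2, the value upper-bounds the rate.
    """
    if not 0.0 <= D <= 0.75:
        raise ValueError("this family covers distortions in [0, 3/4] only")
    spectrum = [1.0 - D] + [D / 3.0] * 3
    h_ab = -sum(x * math.log2(x) for x in spectrum if x > 0.0)
    return 2.0 - h_ab


if __name__ == "__main__":
    for D in (0.0, 0.1, 0.25, 0.5, 0.75):
        print(f"D = {D:.2f}  rate <= {rate_upper_bound(D):.4f} classical bits")
```

At D = 0 the bound equals two classical bits per source qubit, which matches the cost of teleportation, and it falls to zero at D = 3/4, where the output is independent of the reference.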

Proof: We first prove the converse (optimality). Consider
the most general protocol for entanglement-assisted lossy
source coding that acts on many copies ($\rho^{\otimes n}$) of the state
$\rho \in \mathcal{D}(\mathcal{H}_A)$ (depicted in Figure 1(b)). We take a purification
of $\rho$ as $|\psi^\rho_{RA}\rangle$. Let $\Phi_{T_A T_B}$ denote an entangled state, with the
system $T_A$ being with Alice and the system $T_B$ being with
Bob. Alice then acts on the state $\rho^{\otimes n}$ and her share $T_A$ of
the entangled state with a compression map $\mathcal{E}_n \equiv \mathcal{E}^{A^n T_A \to W}$,
where $W$ is a classical system of size $\approx 2^{nr}$, with $r$ being
the rate of compression (in Figure 1(b), $W$ corresponds to
the outputs of the noiseless quantum channels). Then Bob
acts on both the classical system $W$ that he receives and
his share $T_B$ of the entangled state with the decoding map
$\mathcal{D}_n \equiv \mathcal{D}^{W T_B \to B^n}$. The final state should be such that it is
distorted by at most $D$ according to the average distortion
criterion in the limit $n \to \infty$; see (8). With these steps in mind,
consider the following chain of inequalities:

$nr \ge H(W)$
$\ge H(W|T_B)$
$\ge H(W|T_B) - H(W|R^n T_B)$
$= I(W;R^n|T_B)$
$= I(W;R^n|T_B) + I(R^n;T_B)$
$= I(W T_B;R^n)$
$\ge I(B^n;R^n).$

The first inequality follows because the entropy nr of the 
uniform distribution is the largest that the entropy H (W) 
can be. The second inequality follows because conditioning 
cannot increase entropy. The third inequality follows because 
$H(W|R^n T_B) \ge 0$ from the assumption that $W$ is classical.
The first equality follows from the definition of mutual infor- 
mation, and the second equality follows from the fact that R n 
and T B are in a product state. The third equality is the chain 
rule for quantum mutual information. The final inequality is 
from quantum data processing. Continuing, we have 

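$I(B^n;R^n) \ge \sum_{i=1}^n I(B_i;R_i)$
$\ge \sum_{i=1}^n R^q_{eac}\big(d(\rho,\mathcal{F}_n^{(i)})\big)$
$= n \sum_{i=1}^n \frac{1}{n} R^q_{eac}\big(d(\rho,\mathcal{F}_n^{(i)})\big)$
$\ge n R^q_{eac}\Big(\sum_{i=1}^n \frac{1}{n}\, d(\rho,\mathcal{F}_n^{(i)})\Big)$
$\ge n R^q_{eac}(D).$ (17)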

In the above, $\mathcal{F}_n^{(i)}$ is the marginal operation on the $i$-th copy
of the source space induced by the overall operation $\mathcal{F}_n \equiv
\mathcal{D}_n \circ \mathcal{E}_n$, and is given by (7). The first inequality follows from
superadditivity of quantum mutual information (see Lemma 15
in the appendix). The second inequality follows from the fact
that the map $\mathcal{D}_i \circ \mathcal{E}_i$ (that is, the marginal operation $\mathcal{F}_n^{(i)}$)
has distortion $d(\rho,\mathcal{D}_i \circ \mathcal{E}_i)$ and the
information rate-distortion function is the minimum of the
mutual information over all maps with this distortion. The last
two inequalities follow from convexity of the quantum rate-
distortion function $R^q_{eac}(D)$ (see Lemma 14 in the appendix),
from the assumption that the average distortion of the protocol
is no larger than the amount allowed:

$\sum_{i=1}^n \frac{1}{n}\, d(\rho, \mathcal{D}_i \circ \mathcal{E}_i) \le D,$

and from the fact that $R^q_{eac}(D)$ is non-increasing as a function
of $D$ (see Lemma 14 in the appendix).

The direct part of Theorem 2 follows from the quantum
reverse Shannon theorem, which states that it is possible to
simulate (asymptotically perfectly) the action of a quantum
channel $\mathcal{N}$ on an arbitrary state $\rho$ by exploiting noiseless classical
communication and prior shared entanglement between a
sender and receiver [10], Q, (8), [Q2). The resource inequality 
for this protocol is 

$$
I(A;B)_\omega\,[c \to c] + H(B)_\omega\,[qq] \geq \langle \mathcal{N} : \rho \rangle, \tag{18}
$$

where the entropies are with respect to a state of the following
form:

$$
|\omega_{ABE}\rangle \equiv U_{\mathcal{N}}^{A' \to BE} |\psi^{\rho}_{AA'}\rangle,
$$

$|\psi^{\rho}_{AA'}\rangle$ is a purification of $\rho$, and $U_{\mathcal{N}}^{A' \to BE}$ is an isometric
extension of the channel $\mathcal{N}^{A' \to B}$. Our protocol simply exploits
this theorem. More specifically, for a given distortion $D$, we
take $\mathcal{N}$ to be the CPTP map which achieves the minimum
in the expression (16) of $R^q_{eac}(D)$. Then we exploit classical
communication at the rate given in the resource inequality (18)
to simulate the action of the channel $\mathcal{N}$ on the source state
$\rho$. For any arbitrarily small $\varepsilon > 0$ and $n$ large enough, the
protocol for the quantum reverse Shannon theorem simulates
the action of the channel up to the constant $\varepsilon$ (in the sense
of (9)). This allows us to invoke Lemma 1 to show that the
resulting average distortion is no larger than $D + \varepsilon$. ■

The main reason that we can use the quantum reverse 
Shannon theorem as a "black box" for the purpose of quantum 
rate distortion is from our assumption of unlimited shared 
entanglement. It is likely that this protocol uses much more 
entanglement than necessary for the purpose of entanglement- 
assisted quantum rate distortion coding with classical channels, 
and it should be worthwhile to study the trade-off between 
classical communication and entanglement consumption in 
more detail, as previous authors have done in the context of 
channel coding 1521. 155!, t34i 155!. Such a study might lead 
to a better protocol for entanglement-assisted rate distortion 
coding and might further illuminate better protocols for other 
quantum rate distortion tasks. 

We think that our protocol exploits more entanglement than 
necessary from considering what is known in the classical 
case regarding reverse Shannon theorems and rate-distortion 
coding |2T! . fTOl , l22l . First, as reviewed in ([I]), the classical 
mutual information minimized over all stochastic maps that 
meet the distortion criterion is equal to Shannon's classical 
rate-distortion function PT! . Bennett et al. have shown that the 
classical mutual information is also equal to the minimum rate 
needed to simulate a classical channel whenever free common 
randomness is available [10]. Thus, a simple strategy for
achieving the task of rate distortion is for the parties to choose 
the stochastic map that minimizes the rate distortion function 
and simulate it with the classical reverse Shannon theorem. 
But this strategy uses far more classical bits than necessary 
whenever sufficient common randomness is not available [22].
Meanwhile, we already know that the mutual information is 
achievable without any common randomness if the goal is rate 
distortion lHH . 



2) Rate-Distortion with noiseless quantum communication:
The quantum rate distortion function, $R^q_{eaq}(D)$, for
entanglement-assisted lossy source coding with noiseless
quantum communication, is given by the following theorem.

Theorem 3: For a memoryless quantum information source
defined by the density matrix $\rho_A$, with a purification $|\psi^{\rho}_{AA'}\rangle$,
and any given distortion $0 \leq D < 1$, the quantum rate distortion
function for entanglement-assisted lossy source coding
with noiseless quantum communication is given by

$$
R^q_{eaq}(D) = \frac{1}{2} \min_{\mathcal{N} \,:\, d(\rho,\mathcal{N}) \leq D} I(A;B)_\omega, \tag{19}
$$

where $\mathcal{N} \equiv \mathcal{N}^{A' \to B}$ denotes a CPTP map,

$$
\omega_{AB} \equiv (\mathrm{id}_A \otimes \mathcal{N}^{A' \to B})(\psi^{\rho}_{AA'}), \tag{20}
$$

and $I(A;B)_\omega$ denotes its mutual information.

Proof: We first prove the converse (optimality). The setup
is similar to that in the converse proof of Theorem 2, with
the exception that $W$ is now a quantum system and we let
$E$ denote the environment of the compressor. Consider the
following chain of inequalities:

$$
\begin{aligned}
2nr &\geq 2H(W) \\
&= H(W) + H(R^n T_B E) \\
&\geq H(W) + H(R^n T_B E) - H(W R^n T_B E) \\
&= I(W; R^n T_B E) \\
&\geq I(W; R^n T_B) \\
&= I(W T_B; R^n) + I(W; T_B) - I(R^n; T_B) \\
&= I(W T_B; R^n) + I(W; T_B) \\
&\geq I(W T_B; R^n) \\
&\geq I(B^n; R^n).
\end{aligned} \tag{21}
$$

The first inequality is because the entropy $nr$ of the maximally
mixed state is the largest that the entropy $H(W)$ can be.
The first equality follows from the fact that the state on
systems $W R^n T_B E$ is pure. The second inequality follows by
subtracting the non-negative quantity $H(W R^n T_B E)$. The second
equality is from the definition of quantum mutual information.
The third inequality is from quantum data processing (tracing
over system $E$). The third equality is a useful identity for
quantum mutual information. The fourth equality follows from
$I(R^n; T_B) = 0$ since $R^n$ and $T_B$ are in a product state. The
second-to-last inequality is from $I(W;T_B) \geq 0$, and the final
inequality is from the quantum data processing inequality. The
rest of the proof proceeds as in (17).
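The identity used for the third equality in (21), $I(W;R^nT_B) = I(WT_B;R^n) + I(W;T_B) - I(R^n;T_B)$, holds for any tripartite state and can be checked numerically. The following is a minimal sketch, assuming NumPy and a randomly drawn three-qubit state; the helper names (`entropy`, `marginal`, `mutual_info`) are illustrative and not part of the paper.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy (in bits) of a density matrix."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop numerically zero eigenvalues
    return float(-np.sum(evals * np.log2(evals)))

def marginal(rho, dims, keep):
    """Reduced state on the subsystems in `keep` (sorted indices into dims)."""
    t = rho.reshape(dims + dims)
    # Trace out the unwanted subsystems, highest index first.
    for k in sorted(set(range(len(dims))) - set(keep), reverse=True):
        t = np.trace(t, axis1=k, axis2=k + t.ndim // 2)
    d = int(np.prod([dims[k] for k in keep]))
    return t.reshape(d, d)

def mutual_info(rho, dims, X, Y):
    """I(X;Y) = H(X) + H(Y) - H(XY) for disjoint index lists X and Y."""
    H = lambda idx: entropy(marginal(rho, dims, sorted(idx)))
    return H(X) + H(Y) - H(X + Y)

# Random full-rank state on three qubits playing the roles of W, R, T
# (an arbitrary test state, not one produced by a rate distortion code).
rng = np.random.default_rng(0)
dims = [2, 2, 2]
G = rng.normal(size=(8, 8)) + 1j * rng.normal(size=(8, 8))
rho = G @ G.conj().T
rho = rho / np.trace(rho).real

W, R, T = [0], [1], [2]
lhs = mutual_info(rho, dims, W, R + T)
rhs = (mutual_info(rho, dims, W + T, R)
       + mutual_info(rho, dims, W, T)
       - mutual_info(rho, dims, R, T))
print("identity holds:", abs(lhs - rhs) < 1e-9)
```

In the proof itself the last term vanishes because $R^n$ and $T_B$ start in a product state; the random state here does not have that property, so the sketch tests the identity in its general form.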

The direct part follows from a variant of the quantum 
reverse Shannon theorem known as the fully quantum reverse 
Shannon theorem (FQRS) (H, l24l . This theorem states that 
it is possible to simulate (asymptotically perfectly) the action
of a channel $\mathcal{N}$ on an arbitrary state $\rho$ by exploiting noiseless
quantum communication and prior shared entanglement
between a sender and receiver. It has the following resource
inequality:

$$
\frac{1}{2} I(A;B)_\omega\,[q \to q] + \frac{1}{2} I(B;E)_\omega\,[qq] \geq \langle \mathcal{N} : \rho \rangle, \tag{22}
$$

where the entropies are with respect to a state of the following
form:

$$
|\omega_{ABE}\rangle \equiv U_{\mathcal{N}}^{A' \to BE} |\psi^{\rho}_{AA'}\rangle, \tag{23}
$$

$|\psi^{\rho}_{AA'}\rangle$ is a purification of $\rho$, and $U_{\mathcal{N}}^{A' \to BE}$ is an isometric
extension of the channel $\mathcal{N}^{A' \to B}$. Our protocol exploits this
theorem as follows. For a given distortion $D$, take $\mathcal{N}$ to be
the map which realizes the minimum in the expression (19) of
$R^q_{eaq}(D)$. Then we exploit quantum communication at the rate
given in the resource inequality (22) to simulate the action of
the channel $\mathcal{N}$ on the source state $\rho$. For any arbitrarily small
$\varepsilon > 0$ and $n$ large enough, the protocol for the fully quantum
reverse Shannon theorem simulates the action of the channel
up to the constant $\varepsilon$ (in the sense of (9)). This allows us to
invoke Lemma 1 to show that the resulting average distortion
is no larger than $D + \varepsilon$. ■

We could have determined that the form of the
entanglement-assisted quantum rate distortion function
$R^q_{eaq}(D)$ in Theorem 3 follows easily from Theorem 2
by combining with teleportation. However, the above proof
serves an important alternate purpose. A careful inspection
of it reveals that the steps detailed in (21) for bounding the
quantum communication rate still hold even if the system $T_B$
is trivial (in the case where there is no shared entanglement
between the sender and receiver before communication
begins). Thus, we obtain as a corollary that the entanglement-assisted
quantum rate distortion function is a single-letter
lower bound on the unassisted quantum rate distortion
function. This makes sense operationally as well because the
additional resource of shared entanglement should only be
able to improve a rate distortion protocol.

Corollary 4: The entanglement-assisted quantum rate distortion
function $R^q_{eaq}(D)$ in Theorem 3 bounds the unassisted
quantum rate distortion function $R^q(D)$ from below:

$$
R^q(D) \geq R^q_{eaq}(D).
$$

The above corollary firmly asserts that the coherent information
$I(A\rangle B)$ of the state in (20) is not relevant for quantum
rate distortion, in spite of Barnum's conjecture that it would
play a role [4]. That is, one might think that there should be
some simple fix of Barnum's conjecture, say, by conjecturing
that the quantum rate distortion function would instead be
$\max\{0, I(A\rangle B)\}$. The above lower bound asserts that this
cannot be the case because half the mutual information is never
smaller than the coherent information:

$$
\frac{1}{2} I(A;B) \geq \frac{1}{2} I(A;B) - \frac{1}{2} I(A;E) = I(A\rangle B).
$$
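The relation $\frac{1}{2} I(A;B) - \frac{1}{2} I(A;E) = I(A\rangle B)$ relies on the state on $ABE$ being pure, so that $H(AB) = H(E)$ and $H(AE) = H(B)$. A minimal numerical sketch, assuming NumPy and a randomly drawn pure state (the dimensions and helper names are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy (in bits) of a density matrix."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop numerically zero eigenvalues
    return float(-np.sum(evals * np.log2(evals)))

def marginal(rho, dims, keep):
    """Reduced state on the subsystems in `keep` (sorted indices into dims)."""
    t = rho.reshape(dims + dims)
    for k in sorted(set(range(len(dims))) - set(keep), reverse=True):
        t = np.trace(t, axis1=k, axis2=k + t.ndim // 2)
    d = int(np.prod([dims[k] for k in keep]))
    return t.reshape(d, d)

# Random pure state on A (qubit), B (qubit), E (dimension 4).
rng = np.random.default_rng(1)
dims = [2, 2, 4]
psi = rng.normal(size=16) + 1j * rng.normal(size=16)
psi = psi / np.linalg.norm(psi)
rho = np.outer(psi, psi.conj())

H = lambda idx: entropy(marginal(rho, dims, idx))
coherent = H([1]) - H([0, 1])            # I(A>B) = H(B) - H(AB)
mi_AB = H([0]) + H([1]) - H([0, 1])      # I(A;B)
mi_AE = H([0]) + H([2]) - H([0, 2])      # I(A;E)

print("identity holds:", abs(coherent - 0.5 * (mi_AB - mi_AE)) < 1e-9)
print("half the mutual information dominates:", 0.5 * mi_AB >= coherent - 1e-9)
```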



D. Unassisted Quantum Rate-Distortion Coding

The quantum rate distortion function $R^q(D)$ for unassisted
lossy source coding is given by the following theorem.

Theorem 5: For a memoryless quantum information source
defined by the density matrix $\rho_A$, and any given distortion
$0 \leq D < 1$, the quantum rate distortion function is given by

$$
R^q(D) = \lim_{k \to \infty} \frac{1}{k} \min_{\mathcal{N}^{(k)} \,:\, d(\rho,\mathcal{N}^{(k)}) \leq D} \left[ E_p(\rho^{\otimes k},\mathcal{N}^{(k)}) \right], \tag{24}
$$

where $\mathcal{N}^{(k)} : \mathcal{D}(\mathcal{H}_A^{\otimes k}) \to \mathcal{D}(\mathcal{H}_B^{\otimes k})$ is a CPTP map, and

$$
E_p(\rho,\mathcal{N}) \equiv E_p(\omega_{RB}) \tag{25}
$$

denotes the entanglement of purification, with

$$
\omega_{RB} \equiv (\mathrm{id}_R \otimes \mathcal{N}^{A \to B})(\psi^{\rho}_{RA}). \tag{26}
$$

Like its classical counterpart, lossy data compression includes
lossless compression as a special case. If the distortion
$D$ is set equal to zero in (24), then the state $\omega_{RB}$ becomes
identical to the state $\psi^{\rho}_{RA}$. Equivalently, the quantum operation
$\mathcal{N}$ is given by the identity map $\mathrm{id}_A$. Since the entanglement
of purification is additive for tensor power states [54]:

$$
E_p\big((\psi^{\rho}_{RA})^{\otimes n}\big) = n E_p(\psi^{\rho}_{RA}) = n S(\rho_A),
$$

we infer that, for $D = 0$, $R^q(D)$ reduces to the von Neumann
entropy of the source, which is known to be the optimal rate
for lossless quantum data compression [44].

To prove the achievability part of Theorem 5, we can simply
exploit Schumacher compression [44] (which is a special type
of reverse Shannon theorem). Alice feeds each output $A$ of the
source into a CPTP map $\mathcal{N}$ that saturates the bound in (24)
(for now, we do not consider the limit and set $k = 1$). This
leads to a state of the form in (26), to which Alice can then
apply Schumacher compression. This protocol is equivalent to
the following resource inequality:

$$
H(B)_\omega\,[q \to q] \geq \langle \mathcal{N} : \rho \rangle. \tag{27}
$$

We note that this is a simple form of an unassisted quantum
reverse Shannon theorem.

Now, a subtle detail of the simulation idea is that we are
interested in simulating the channel $\mathcal{N}^{A \to B}$ from Alice to
Bob, and Alice can actually simulate an isometric extension
$U_{\mathcal{N}}^{A \to BE}$ of the channel where Alice receives the system $E$
and just traces over it.

However, instead of simulating $U_{\mathcal{N}}^{A \to BE}$, we could consider
Alice to simulate the isometry $U_{\mathcal{N}}^{A \to B E_B E_A}$ locally, Schumacher
compressing the subsystems $B$ and $E_B$ so that Bob
can recover them, while the subsystem $E_A$ remains with Alice.
This leads to the following protocol for unassisted simulation:

$$
H(B E_B)_\omega\,[q \to q] \geq \langle \mathcal{N} : \rho \rangle.
$$

The best protocol for unassisted channel simulation is therefore
the one with the minimum rate of quantum communication,
the minimum being taken over all possible isometries
$V : E \to E_A E_B$. This rate can be no larger than the rate
of quantum communication required for the original naive
protocol in (27) since the latter is a special case in the
minimization. This is the form of the unassisted quantum 
reverse Shannon theorem given in Ref. 0. 

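One could then execute the above protocol by blocking $k$
of the states together and by having the distortion channel be
of the form $\mathcal{N}^{(k)} : A^k \to B^{(k)}$, acting on each block of $k$
states. By letting $k$ become large, such a protocol leads to the
following rate for unassisted communication:

$$
Q_{\min}(\mathcal{N}) \equiv \lim_{k \to \infty} \frac{1}{k} \min_{V : E^{(k)} \to E_A E_B} H\big(B^{(k)} E_B\big). \tag{28}
$$

The above quantity is equal to the entanglement of purification
of the state $(\mathrm{id}_{R^k} \otimes \mathcal{N}^{A^k \to B^{(k)}})\big((\psi^{\rho}_{RA})^{\otimes k}\big)$ (see Ref. 0):

$$
\begin{aligned}
\lim_{k \to \infty} \frac{1}{k} \min_{V : E^{(k)} \to E_A E_B} H\big(B^{(k)} E_B\big)
&= \lim_{k \to \infty} \frac{1}{k} \min_{\Lambda^{E^{(k)} \to E_B}} H\Big(\Lambda^{E^{(k)} \to E_B}\big(U_{\mathcal{N}^{(k)}}(\rho^{\otimes k})\big)\Big) \\
&= \lim_{k \to \infty} \frac{1}{k} E_p\Big((\mathrm{id}_{R^k} \otimes \mathcal{N}^{A^k \to B^{(k)}})\big((\psi^{\rho}_{RA})^{\otimes k}\big)\Big).
\end{aligned}
$$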

We are now in a position to prove Theorem 5.

Proof of Theorem 5: Fix the map $\mathcal{N}$ such that the
minimization on the RHS of (24) is achieved. The quantum
reverse Shannon theorem (in this case, Schumacher compression)
states that it is possible to simulate such a channel
$\mathcal{N}$ acting on $\rho$ with the amount of quantum communication
equal to $E_p(\omega_{RB})$. Since the protocol simulates the channel
up to some arbitrarily small positive $\varepsilon$, the distortion is no
larger than $D + \varepsilon$ by invoking Lemma 1. This establishes that
the rate $E_p(\omega_{RB})$ is achievable, that is, $R^q(D) \leq E_p(\omega_{RB})$. We can have a regularization as above to
obtain the expression in the statement of the theorem.

The converse part of the theorem can be proved as follows.
Figure 1(a) depicts the most general protocol for unassisted
quantum rate-distortion coding. Let $E_1$ denote the environment
of the encoder, and let $E_2$ denote the environment of the
decoder, while $W$ again denotes the outputs of the noiseless
quantum channels labeled by "id." For any rate distortion code
$(\mathcal{E}^{(n)}, \mathcal{D}^{(n)})$ of rate $r$ satisfying $d(\rho, \mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}) \leq D$, we
have

$$
\begin{aligned}
nr &\geq H(W) \\
&= H(E_2 B^n)_\omega \\
&\geq \min_{\Lambda^{E_1 E_2}} H\big((\mathrm{id}_{B^n} \otimes \Lambda^{E_1 E_2})(\omega_{B^n E_1 E_2})\big) \\
&= E_p\big((\mathrm{id}_{R^n} \otimes (\mathcal{D}^{(n)} \circ \mathcal{E}^{(n)}))((\psi^{\rho}_{RA})^{\otimes n})\big) \\
&\geq \min_{\mathcal{N}^{(n)} \,:\, d(\rho,\mathcal{N}^{(n)}) \leq D} E_p\big((\mathrm{id}_{R^n} \otimes \mathcal{N}^{(n)})((\psi^{\rho}_{RA})^{\otimes n})\big).
\end{aligned} \tag{29}
$$

The first inequality follows because the entropy of the maximally
mixed state is no smaller than the entropy of any state on
system $W$. The first equality follows because the isometric
extension of the decoder maps $W$ isometrically to the systems
$E_2$ and $B^n$. The second inequality follows because the entropy
minimized over all CPTP maps on systems $E_1$ and $E_2$ can
only be smaller than the entropy on $E_2 B^n$ (the identity map
on $E_2$ and partial trace of $E_1$ is a CPTP map included in the
minimization). The second equality follows from the definition
of entanglement of purification. The third inequality follows
by minimizing the entanglement of purification over all maps
that satisfy the distortion criterion (recall that we assume our
protocol satisfies this distortion criterion). ■

Our characterization of the unassisted quantum rate distortion
task is unfortunately up to a regularization. It is likely
that this regularized formula is blurring a better quantum
rate-distortion formula, as has sometimes been the case in
quantum Shannon theory [60]. This is due in part to our
exploitation of the unassisted reverse Shannon theorem for
the task of quantum rate distortion, and the fact that the
goal of a reverse Shannon theorem is stronger than that of
a rate distortion protocol, while no correlated resources are
available in this particular setting (see the previous discussion
after Theorem 2). It would be ideal to demonstrate that the
regularization is not necessary, but it is not clear yet how
to do so without a better way to realize unassisted quantum
rate distortion. Nevertheless, the above theorem at the very
least disproves Barnum's conjecture because we have demonstrated
that the quantum rate distortion function is always
non-negative (due to the fact that entanglement of purification
is non-negative [54]), whereas Barnum's rate distortion function
can become negative (see footnote 3). Furthermore, Corollary 4 provides a
good single-letter, non-negative lower bound on the unassisted
quantum rate distortion function, which is never smaller than
Barnum's bound in terms of the coherent information.
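The example behind the claim that the coherent information can go negative (a maximally mixed qubit source sent through the completely depolarizing map) is small enough to evaluate directly. The sketch below assumes NumPy and takes the distortion to be one minus the entanglement fidelity, which is what the numbers quoted in footnote 3 suggest; the variable names are illustrative.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy (in bits) of a density matrix."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop numerically zero eigenvalues
    return float(-np.sum(evals * np.log2(evals)))

# Bell state on R (first qubit) and A (second qubit): purification of I/2.
phi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
Phi = np.outer(phi, phi.conj())

# Kraus operators of the completely depolarizing qubit map: {I, X, Y, Z} / 2.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
kraus = [P / 2 for P in (I2, X, Y, Z)]

# omega_RB = (id_R ⊗ N)(Phi), with the map acting on the second qubit.
omega = sum(np.kron(I2, K) @ Phi @ np.kron(I2, K).conj().T for K in kraus)

fidelity = float(np.real(phi.conj() @ omega @ phi))   # entanglement fidelity
distortion = 1.0 - fidelity                           # assumed distortion measure

t = omega.reshape(2, 2, 2, 2)
omega_B = t.trace(axis1=0, axis2=2)   # trace out R
omega_R = t.trace(axis1=1, axis2=3)   # trace out B

coherent = entropy(omega_B) - entropy(omega)          # I(R>B) = H(B) - H(RB)
half_mi = 0.5 * (entropy(omega_R) + entropy(omega_B) - entropy(omega))

print(f"distortion = {distortion:.3f}, I(R>B) = {coherent:.3f}, "
      f"I(R;B)/2 = {half_mi:.3f}")
```

For this state the coherent information evaluates to $-1$ while half the mutual information is $0$, so the single-letter lower bound of Corollary 4 stays non-negative where the coherent information does not.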

IV. Source-Channel Separation Theorems 

This last section of our paper consists of five important 
quantum source-channel separation theorems. The first two 
theorems apply whenever a sender wishes to transmit a mem- 
oryless classical source over a memoryless quantum channel, 
whereas the third applies when the information source to be 
transmitted is a quantum source. The second theorem deals 
with the situation in which some distortion is allowed in the 
transmission. All three of these theorems are expressed in terms
of single-letter formulas whenever the corresponding capacity 
formulas are single-letter. 

The last two theorems correspond to the cases in which a 
quantum source is sent over an entanglement-assisted quantum 
channel, with and without distortion. The formulas in these 
are always single-letter, demonstrating that it is again the 
entanglement-assisted formulas which are in formal analogy 
with Shannon's classical formulas. 

A. Shannon's source-channel separation theorem for quantum
channels 

Shannon's original source-channel separation theorem ap- 
plies to the transmission of a classical information source over 
a classical channel. Despite the importance of this theorem, 
it does not take into account that the carriers of information 
are essentially quantum-mechanical. So our first theorem is a 
restatement of Shannon's source-channel separation theorem 
for the case in which a classical information source is to be 
reliably transmitted over a quantum channel. 

Figure 2 depicts the scenario to which this first source-channel
separation theorem applies. The most general protocol
for sending the output of a classical information source over
a quantum channel consists of three steps: encoding, transmission,
and decoding. The sender first takes the outputs $U^n$
of the classical information source and encodes them with
some CPTP encoding map $\mathcal{E}^{U^n \to A'^n}$, where the systems $A'^n$
are the inputs to many uses of a noisy quantum channel
$\mathcal{N}^{A' \to B}$. The sender then transmits the systems $A'^n$ over
the quantum channels, and the receiver obtains the outputs
$B^n$. The receiver finally performs some CPTP decoding map
$\mathcal{D}^{B^n \to \hat{U}^n}$ to recover the random variables $U^n$ (note that this
decoding is effectively a POVM because the output systems
are classical). If the scheme is any good for transmitting the
source, then the following condition holds for any given $\varepsilon > 0$,
for sufficiently large $n$:

$$
\Pr\{\hat{U}^n \neq U^n\} \leq \varepsilon. \tag{30}
$$

Fig. 2. The most general protocol for transmitting a classical information
source over a memoryless quantum channel.

3 To see that Barnum's proposed distortion function can become negative,
consider the case of a maximally mixed qubit source, whose purification is
the maximally entangled Bell state. Suppose that we allow the distortion to
be as large as 3/4. Then a particular map satisfying the distortion criterion
is the completely depolarizing map because it produces a tensor product of
maximally mixed qubits, whose entanglement fidelity with the maximally
entangled state is equal to 1/4. The coherent information of a tensor product
of maximally mixed qubits is equal to its minimum value of $-1$.


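Theorem 6: The following condition is necessary and sufficient
for transmitting the output of a memoryless classical
information source, characterized by a random variable $U$,
over a memoryless quantum channel $\mathcal{N} \equiv \mathcal{N}^{A' \to B}$ with
additive Holevo capacity:

$$
H(U) \leq \chi^*(\mathcal{N}), \tag{31}
$$

where

$$
\chi^*(\mathcal{N}) \equiv \max_{\{p_X(x),\,\rho_x\}} I(X;B)_\rho,
$$

with

$$
\rho_{XB} \equiv \sum_x p_X(x)\,|x\rangle\langle x|_X \otimes \mathcal{N}^{A' \to B}(\rho_x).
$$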
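As a concrete illustration of condition (31), the Holevo information $I(X;B)_\rho$ can be evaluated for a specific ensemble and compared with a source entropy. The sketch below assumes NumPy, a qubit depolarizing channel $\rho \mapsto (1-q)\rho + q\,I/2$ (a channel chosen here for illustration, not one discussed in the text), and the equiprobable ensemble of two orthogonal pure states. The value it computes is the Holevo information of that one ensemble, which in general only lower-bounds $\chi^*(\mathcal{N})$; for this particular channel the ensemble is known to be optimal, a fact taken from the literature rather than from the present paper.

```python
import numpy as np

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

def entropy(rho):
    """Von Neumann entropy (in bits) of a density matrix."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # drop numerically zero eigenvalues
    return float(-np.sum(evals * np.log2(evals)))

def depolarize(rho, q):
    """Qubit depolarizing channel with parameter q (illustrative choice)."""
    return (1 - q) * rho + q * np.eye(2) / 2

def holevo_information(probs, states, q):
    """I(X;B) for the ensemble {probs, states} sent through the channel."""
    outs = [depolarize(s, q) for s in states]
    avg = sum(p * o for p, o in zip(probs, outs))
    return entropy(avg) - sum(p * entropy(o) for p, o in zip(probs, outs))

q = 0.2
zero = np.diag([1.0, 0.0])
one = np.diag([0.0, 1.0])
chi = holevo_information([0.5, 0.5], [zero, one], q)

def source_fits(p_one):
    """Check H(U) <= chi for a binary source with Pr{U = 1} = p_one."""
    return h2(p_one) <= chi

print(f"Holevo information of the ensemble: {chi:.4f}")
print("biased source (p = 0.1) satisfies the condition:", source_fits(0.1))
print("uniform source (p = 0.5) satisfies the condition:", source_fits(0.5))
```

For this ensemble the Holevo information works out to $1 - h_2(q/2)$, so a sufficiently biased binary source meets the condition while a uniform one does not.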



Proof: Sufficiency of (31) is a direct consequence of
Shannon compression and Holevo-Schumacher-Westmoreland
(HSW) coding. The sender first compresses the information
source down to a set of size $\approx 2^{nH(U)}$. The sender then employs
an HSW code to transmit any message in the compressed
set over $n$ uses of the quantum channel. Reliability of the
scheme follows from the assumption that $H(U) \leq \chi^*(\mathcal{N})$,
the HSW coding theorem, and Shannon compression.

Necessity of (31) follows from reasoning similar to that in
the proof of the classical source-channel separation theorem
[21]. Fix $\varepsilon > 0$. We begin by assuming that there exists a good
scheme that meets the criterion in (30). Consider the following
chain of inequalities:

$$
\begin{aligned}
nH(U) &= H(U^n) \\
&= I(U^n;\hat{U}^n) + H(U^n|\hat{U}^n) \\
&\leq I(U^n;\hat{U}^n) + 1 + \Pr\{\hat{U}^n \neq U^n\}\, n \log|\mathcal{U}| \\
&\leq I(U^n;B^n) + 1 + \varepsilon n \log|\mathcal{U}| \\
&\leq \chi^*(\mathcal{N}^{\otimes n}) + 1 + \varepsilon n \log|\mathcal{U}| \\
&= n\chi^*(\mathcal{N}) + 1 + \varepsilon n \log|\mathcal{U}|.
\end{aligned} \tag{32}
$$

The first equality follows from the assumption that the classical
information source is memoryless. The second equality is
a simple identity. The first inequality follows from applying
Fano's inequality. The second inequality follows from the
quantum data processing inequality and the assumption that
(30) holds. The third inequality follows because $I(U^n;B^n)$
must be smaller than the maximum of this quantity over all
classical-quantum states that can serve as an input to the
tensor power channel $\mathcal{N}^{\otimes n}$. The final equality follows from
the assumption that the Holevo capacity is additive for the
particular channel $\mathcal{N}$. Thus, any protocol that reliably transmits
the information source $U$ should satisfy the following
inequality

$$
H(U) \leq \chi^*(\mathcal{N}) + \left(1/n + \varepsilon \log|\mathcal{U}|\right),
$$

which converges to (31) as $n \to \infty$ and $\varepsilon \to 0$. ■

Remark 7: If the Holevo capacity is not additive for the
channel, then the best statement of the source-channel separation
theorem is in terms of the regularized quantity:

$$
H(U) \leq \chi^*_{\mathrm{reg}}(\mathcal{N}),
$$

where

$$
\chi^*_{\mathrm{reg}}(\mathcal{N}) \equiv \lim_{n \to \infty} \frac{1}{n} \chi^*(\mathcal{N}^{\otimes n}),
$$

but it is unclear how useful such a statement is because
we cannot compute such a regularized quantity. (The above
statement follows by applying all of the inequalities in the
proof of Theorem 6 except the last one.)

What if the condition $H(U) > \chi^*(\mathcal{N})$ holds instead? We
can prove a variant of the above source-channel separation theorem
that allows for the information source to be reconstructed
at the receiving end up to some distortion $D$. We obtain the
following theorem:

Theorem 8: The following condition is necessary and sufficient
for transmitting the output of a memoryless classical
information source over a quantum channel with additive
Holevo capacity (up to some distortion $D$):

$$
R(D) \leq \chi^*(\mathcal{N}), \tag{33}
$$

where $R(D)$ is defined in (1).

Proof: Sufficiency of (33) follows from the rate distortion
protocol and the HSW coding theorem. Specifically, the sender
compresses the information source down to a set of size
$\approx 2^{nR(D)}$ and then uses an HSW code to transmit any element of
this set. The reconstructed sequence $\hat{U}^n$ at the receiving end
obeys the distortion constraint $\mathbb{E}\{d(U,\hat{U})\} \leq D$, with $d(U,\hat{U})$
denoting a suitably defined distortion measure.

Necessity of (33) follows from the fact that

$$
nR(D) \leq I(U^n;\hat{U}^n), \tag{34}
$$

and by applying the last four steps in the chain of inequalities



in ([32]). A proof of (34) is available in (10.61-10.71) of 
Ref. ED. ■ 

B. Quantum source-channel separation theorem 

We now prove a source-channel separation theorem 
which is perhaps more interesting for quantum comput- 
ing/communication applications. Suppose that a sender would 
like to transmit a quantum information source faithfully over a 
quantum channel, such that the receiver perfectly recovers the 
transmitted quantum source in the limit of many copies of the 
source and uses of the channel. Figure 3 depicts the scenario to
which our second source-channel separation theorem applies.

As before, we characterize a memoryless quantum information
source by a density matrix $\rho_A \in \mathcal{D}(\mathcal{H}_A)$, and let
$|\psi^{\rho}_{RA}\rangle \in \mathcal{H}_R \otimes \mathcal{H}_A$ denote its purification. The entropy of
the source $H(A)_\rho$ is given by (§. Let $\mathcal{N}^{A' \to B}$ denote a
memoryless quantum channel. Suppose Alice has access to
multiple uses of the source, and she and Bob are allowed 
multiple uses of the quantum channel. 

Since Alice needs to act on many copies of the state $\rho$,
we instead suppose that she is acting on the $A$ systems of
the tensor power state $|\psi^{\rho}_{RA}\rangle^{\otimes n}$. The most general protocol is
one in which Alice performs some CPTP encoding map $\mathcal{E}_n \equiv
\mathcal{E}^{A^n \to A'^n}$ on the $A$ systems of the state $|\psi^{\rho}_{RA}\rangle^{\otimes n}$, producing
some output systems $A'^n$ which can serve as input to many
uses of the quantum channel $\mathcal{N}^{A' \to B}$. Alice then transmits
the $A'^n$ systems over the channels, leading to some output
systems $B^n$ for Bob. Bob then acts on these systems with
some decoding map $\mathcal{D}_n \equiv \mathcal{D}^{B^n \to \hat{A}^n}$. If the protocol is any
good for transmitting the quantum information source, then the
following condition should hold for any $\varepsilon > 0$ and sufficiently
large $n$:

$$
\left\| (\psi^{\rho}_{RA})^{\otimes n} - \mathcal{D}_n\big(\mathcal{N}^{\otimes n}\big(\mathcal{E}_n\big((\psi^{\rho}_{RA})^{\otimes n}\big)\big)\big) \right\|_1 \leq \varepsilon. \tag{35}
$$

The relation between trace distance and entanglement fidelity
[56] implies that

$$
F_e(\rho^{\otimes n}, \Lambda_n) \geq 1 - \varepsilon, \tag{36}
$$

where $\Lambda_n$ is the composite map $\Lambda_n \equiv \mathcal{D}_n \circ \mathcal{N}^{\otimes n} \circ \mathcal{E}_n$.

We can now state our first variant of a quantum source- 
channel separation theorem. 

Theorem 9: The following condition is necessary and sufficient
for transmitting the output of a memoryless quantum
information source, characterized by a density matrix $\rho_A$, over
a quantum channel $\mathcal{N} \equiv \mathcal{N}^{A' \to B}$ with additive coherent
information:

$$
H(A) \leq Q(\mathcal{N}), \tag{37}
$$

where $H(A)$ is the entropy of the quantum information
source, and $Q(\mathcal{N})$ is the coherent information of the channel
$\mathcal{N}$:

$$
Q(\mathcal{N}) \equiv \max_{|\phi_{AA'}\rangle} I(A\rangle B)_\sigma, \qquad
\sigma_{AB} \equiv \mathcal{N}^{A' \to B}(\phi_{AA'}).
$$

Fig. 3. The most general protocol for transmitting a quantum information 
source over a memoryless quantum channel. 



Proof: Sufficiency of (37) follows from Schumacher
compression and the direct part of the quantum capacity
theorem [37], [51], [23]. Specifically, the sender compresses
the source down to a space of dimension $\approx 2^{nH(A)}$ with
the Schumacher compression protocol. She then encodes this
subspace with a quantum error correction code for the channel
$\mathcal{N}$. The condition in (37) guarantees that we can apply the
direct part of the quantum capacity theorem, and combined
with achievability of Schumacher compression, the receiver
can recover the quantum information source with asymptotically
small error in the limit of many copies of the source and
many uses of the quantum channel.

Fix $\varepsilon > 0$ and note that $H(A)_\rho = H(R)_\psi$ since $\psi^{\rho}_{RA}$ is a
pure state. Then the necessity of (37) follows from the chain
of inequalities given below. Note that the subscripts denoting
the states have been omitted for simplicity:

$$
\begin{aligned}
nH(A) &= nH(R) \\
&= H(R^n) \\
&\leq I(R^n\rangle B^n) + 2 + 4(1 - F_e)\log|R^n| \\
&\leq I(R^n\rangle B^n M) + 2 + 4\varepsilon n \log|R| \\
&= \sum_m p(m)\, I(R^n\rangle B^n)_{\rho_m} + 2 + 4\varepsilon n \log|R| \\
&\leq Q(\mathcal{N}^{\otimes n}) + 2 + 4\varepsilon n \log|R| \\
&= nQ(\mathcal{N}) + 2 + 4\varepsilon n \log|R|.
\end{aligned} \tag{38}
$$


The first equality follows because $\psi^{\rho}_{RA}$ is pure, and the second
follows from the assumption that the initial
state $|\psi^{\rho}_{RA}\rangle^{\otimes n}$ is a tensor power state. The first inequality
follows from (7.34) of Ref. [6], a fundamental relation between
the input entropy, the coherent information of a channel, and
the entanglement fidelity of any quantum error correction code.

Now, the encoding that Alice employs may in general be
some CPTP encoding map (and not an isometry). However,
Alice can simulate any such CPTP map by first performing an
isometry and then a von Neumann measurement on the system
not fed into the channel (the environment of the simulated
CPTP map). Let $M$ denote the classical system resulting from
measuring the environment of the simulated CPTP map. We
can write the state after the channel acts as a classical-quantum
state of the following form:

$$
\sum_m p(m)\,|m\rangle\langle m|_M \otimes \rho_m^{R^n B^n}.
$$

Then the second inequality follows from the quantum data
processing inequality and (36). The third equality follows
because

$$
I(R^n\rangle B^n M) = I(R^n\rangle B^n|M) = \sum_m p(m)\, I(R^n\rangle B^n)_{\rho_m},
$$

whenever the conditioning system is classical [56]. The third
inequality follows because the channel's coherent information
is never smaller than any individual $I(R^n\rangle B^n)_{\rho_m}$ (and thus
never smaller than the average). The final equality follows
from the assumption that the channel has additive coherent
information (this holds for degradable quantum channels [27]
and is suspected to hold for two-Pauli channels [53]). Thus,
any protocol that reliably transmits the quantum information
source should satisfy the following inequality

$$
H(R) \leq Q(\mathcal{N}) + \left(2/n + 4\varepsilon \log|R|\right),
$$

which converges to (37) as $n \to \infty$ and $\varepsilon \to 0$. ■

Remark 10: A similar comment as in Remark 7 holds
whenever it is not known that the channel has additive coherent
information.



C. Entanglement-assisted quantum source-channel separation
theorem

Our final source-channel separation theorem applies to the
scenario where Alice and Bob have unlimited prior shared
entanglement. The statement of this theorem is that the entropy
of the quantum information source being less than the
entanglement-assisted quantum capacity of the channel [10],
[26], [56] is both a necessary and sufficient condition for
the faithful transmission of the source over an entanglement-assisted
quantum channel. This theorem is the most powerful
of any of the above because the formulas involved are all
single-letter, for any memoryless source and channel.

Figure 4 depicts the scenario to which this last theorem
applies. The situation is nearly identical to that of the previous
section, with the exception that Alice and Bob have unlimited
prior shared entanglement. Alice begins by performing some
CPTP encoding map $\mathcal{E}_n \equiv \mathcal{E}^{A^n T_A \to A'^n}$ on the systems $A^n$
from the quantum information source and on her share $T_A$ of
the entanglement, producing some output systems $A'^n$ which
can serve as input to many uses of a quantum channel $\mathcal{N}^{A' \to B}$.
Alice then transmits the $A'^n$ systems over the channels,
leading to some output systems $B^n$ for Bob. Bob then acts
on these systems and his share $T_B$ of the entanglement with
some decoding map $\mathcal{D}_n \equiv \mathcal{D}^{B^n T_B \to \hat{A}^n}$. If the protocol is any
good for transmitting the quantum information source, then the
following condition should hold for any $\varepsilon > 0$ and sufficiently
large $n$:

$$
\left\| (\psi^{\rho}_{RA})^{\otimes n} - \mathcal{D}_n\big(\mathcal{N}^{\otimes n}\big(\mathcal{E}_n\big((\psi^{\rho}_{RA})^{\otimes n} \otimes \Phi_{T_A T_B}\big)\big)\big) \right\|_1 \leq \varepsilon, \tag{39}
$$

where $\Phi_{T_A T_B}$ is the entangled state that they share before
communication begins (it does not necessarily need to be
maximally entangled). This leads to our final source-channel
separation theorem:

Fig. 4. The most general protocol for transmitting a quantum information
source over a memoryless, entanglement-assisted quantum channel.

Theorem 11: The following condition is necessary and sufficient
for transmitting the output of a memoryless quantum
information source, characterized by a density matrix $\rho_A$, over
any entanglement-assisted quantum channel $\mathcal{N} \equiv \mathcal{N}^{A' \to B}$:

$$
H(A) \leq \frac{1}{2} I(\mathcal{N}), \tag{40}
$$

where $H(A)$ is the entropy of the quantum information
source, and

$$
I(\mathcal{N}) \equiv \max_{|\phi_{AA'}\rangle} I(A;B)_\sigma, \qquad
\sigma_{AB} \equiv \mathcal{N}^{A' \to B}(\phi_{AA'}).
$$

Proof: Sufficiency of (40) follows from reasoning similar
to that in the proof of Theorem 9. We just exploit Schumacher
compression and the entanglement-assisted quantum capacity
theorem d, EH, EH. 



Fix $\varepsilon > 0$ and note that $H(A)_\rho = H(R)_\psi$ since $\psi^{\rho}_{RA}$ is a
pure state. Then necessity of (40) follows from the following
chain of inequalities. Once again, the subscripts denoting the
states have been omitted for simplicity:

$$
\begin{aligned}
2nH(R) &= 2H(R^n) \\
&\leq H(R^n) + I(R^n\rangle B^n T_B) + 2 + 4(1 - F_e)\log|R^n| \\
&\leq I(R^n; B^n T_B) + 2 + 4n\varepsilon \log|R| \\
&= I(R^n T_B; B^n) + I(R^n; T_B) - I(T_B; B^n) + 2 + 4n\varepsilon \log|R| \\
&= I(R^n T_B; B^n) - I(T_B; B^n) + 2 + 4n\varepsilon \log|R| \\
&\leq I(R^n T_B; B^n) + 2 + 4n\varepsilon \log|R| \\
&\leq I(R^n T_B M; B^n) + 2 + 4n\varepsilon \log|R| \\
&\leq \max_{\rho_{XAA'^n}} I(AX;B^n) + 2 + 4n\varepsilon \log|R| \\
&= I(\mathcal{N}^{\otimes n}) + 2 + 4n\varepsilon \log|R| \\
&= nI(\mathcal{N}) + 2 + 4n\varepsilon \log|R|.
\end{aligned} \tag{41}
$$

The first inequality follows by applying the same reasoning
as the first inequality in (38). The second inequality follows
by applying $H(R^n) + I(R^n \rangle B^n T_B) = I(R^n; B^n T_B)$ and
the fact that $1 - F_e \leq \epsilon$ for a protocol satisfying (39).
The second equality follows from a useful identity for
quantum mutual information. The third equality follows from
the assumption that systems $R^n$ and $T_B$ begin in a product
state, so that $I(R^n; T_B) = 0$. The third inequality follows
because $I(T_B; B^n) \geq 0$.
The fourth inequality follows from the reasoning, similar to
that used in the proof of Theorem 9, that Alice simulates
an isometry and measures the environment (also exploiting
the quantum data processing inequality). The next inequality
follows because the state on $R^n T_B M B^n$ is a state of the form



$$\sum_x p_X(x)\, |x\rangle\langle x|_X \otimes \mathcal{N}^{\otimes n}_{A'^n \to B^n}(\rho^{x}_{AA'^n}),$$



where we identify $R^n T_B$ with $A$, and $M$ with $X$. Thus,
the information quantity $I(R^n T_B M; B^n)$ can never be larger
than the maximum over all states of that form. The
second-to-last equality was proved in Refs. [58], [56]. The
final equality follows from additivity of the channel's quantum
mutual information [10], [56]. Thus, any entanglement-assisted
protocol that reliably transmits the quantum information
source should satisfy the following inequality



$$H(A)_{\rho} \leq \frac{1}{2} I(\mathcal{N}) + \left( \frac{1}{n} + 2\epsilon \log |R| \right),$$

which converges to (40) as $n \to \infty$ and $\epsilon \to 0$. ■
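Two steps of the proof above are purely algebraic and may be worth spelling out. The following is a sketch in expanded notation, not a quotation from the source: it first verifies the mutual-information identity invoked in the proof by writing every term as von Neumann entropies, and then shows the division by $2n$ that turns the end of the chain into the final bound.

```latex
\begin{align*}
& I(R^n T_B; B^n) + I(R^n; T_B) - I(T_B; B^n) \\
&\quad = \big[ H(R^n T_B) + H(B^n) - H(R^n T_B B^n) \big]
        + \big[ H(R^n) + H(T_B) - H(R^n T_B) \big] \\
&\qquad - \big[ H(T_B) + H(B^n) - H(T_B B^n) \big] \\
&\quad = H(R^n) + H(B^n T_B) - H(R^n B^n T_B) \\
&\quad = I(R^n; B^n T_B), \\[1ex]
& 2nH(R) \le nI(\mathcal{N}) + 2 + 4n\epsilon \log|R|
  \;\Longrightarrow\;
  H(R) \le \tfrac{1}{2} I(\mathcal{N}) + \tfrac{1}{n} + 2\epsilon \log|R| .
\end{align*}
```

Both correction terms vanish in the limit: $1/n$ as the block length grows, and $2\epsilon \log|R|$ as the error tolerance shrinks, since $|R|$ is a fixed dimension.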
What if the condition $H(A) > \frac{1}{2} I(\mathcal{N})$ holds instead? We
can prove a variant of the above source-channel separation theorem
that allows for the information source to be reconstructed
at the receiving end up to some distortion $D$. We obtain the
following theorem:

Theorem 12: The following condition is necessary and sufficient
for transmitting the output of a memoryless quantum
information source over an entanglement-assisted quantum
channel (up to some distortion $D$):

$$R^{q}_{\mathrm{eac}}(D) \leq \frac{1}{2} I(\mathcal{N}), \tag{42}$$

where $R^{q}_{\mathrm{eac}}(D)$ is defined in (9).

Proof: Sufficiency of (42) follows from the entanglement-assisted
rate distortion protocol from Theorem 3 and the
entanglement-assisted quantum capacity theorem [10], [26].
That is, the sender compresses the information source down
to a space of size $2^{n R^{q}_{\mathrm{eac}}(D)}$ and then uses an
entanglement-assisted quantum code to transmit any state in this subspace.
The reconstructed state at the receiving end obeys the distortion
constraint.



Necessity of (42) follows from the fact that

$$n R^{q}_{\mathrm{eac}}(D) \leq \frac{1}{2} I(R^n; A^n), \tag{43}$$

by applying the quantum data processing inequality to get
$I(R^n; A^n) \leq I(R^n; B^n T_B)$, and finally by applying the last
seven steps in the chain of inequalities in (41). A proof of (43)
is available in (17) of the proof of Theorem 2. ■
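As we read the argument, the necessity direction can be assembled into one chain. The sketch below is an interpretation of the proof rather than a quotation: it assumes that the steps of (41) from $I(R^n; B^n T_B)$ onward go through unchanged (without the fidelity correction terms, which do not arise here), and that $A^n$ denotes the reconstructed output at the receiving end.

```latex
\begin{align*}
n R^{q}_{\mathrm{eac}}(D)
  &\le \tfrac{1}{2} I(R^n; A^n)
     && \text{by (43)} \\
  &\le \tfrac{1}{2} I(R^n; B^n T_B)
     && \text{data processing through the decoder} \\
  &\le \tfrac{1}{2} I(\mathcal{N}^{\otimes n})
     && \text{remaining steps of (41)} \\
  &= \tfrac{n}{2} I(\mathcal{N})
     && \text{additivity of } I(\mathcal{N}).
\end{align*}
```

Dividing by $n$ recovers (42).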



V. Conclusion 

We have proved several quantum rate-distortion theorems 
and quantum source-channel separation theorems. All of our 
quantum rate-distortion protocols employ the quantum reverse 
Shannon theorems (TO), O, El, 0, fl2. This strategy 
works out well whenever unlimited entanglement is avail- 
able, but it clearly leads to undesirable regularized formu- 
las in the unassisted setting. Our quantum source-channel 
separation theorems demonstrate in many cases that a two- 
stage compression-channel-coding strategy works best for 
memoryless sources and for quantum channels with additive 
capacity measures. Our most satisfying result is again in
the entanglement-assisted setting: the
entanglement-assisted rate distortion function not
exceeding the entanglement-assisted quantum capacity is both
necessary and sufficient for transmission of a source over a
channel up to some distortion.

The most important open question going forward from here 
is to determine better protocols for quantum rate distortion that 
do not rely on the reverse Shannon theorems. The differing 
goals of a reverse Shannon theorem and a rate distortion 
protocol are what lead to complications with regularization 
in Theorem 3.

Another productive avenue could be to explore scenarios 
where the unassisted quantum source-channel separation the- 
orem does not apply. In the classical case, it is known that 
certain sources and channels without a memoryless structure 
can violate the source-channel separation theorem [55], and
similar ideas would possibly demonstrate a violation for the
quantum case. In the quantum case, however, it may well
be that certain memoryless sources and channels
violate source-channel separation, but we would need a better
understanding of quantum capacity in the general case in order
to determine definitively whether this is so.

Other interesting questions are as follows: Does the 
entanglement-assisted quantum source-channel separation the- 
orem apply if sender and receiver are given unlimited access 
to a quantum feedback channel, given what we already know 
about quantum feedback [14]? Can anything learned from
source-channel separation for classical broadcast or wiretap 
channels be applied to figure out a more general characteriza- 
tion for quantum channels that are not degradable? 
[...] anonymous referees for helpful suggestions. ND and MHH
[...]ment number 213681. MMW acknowledges financial support
from the MDEIE (Quebec) PSR-SIIRI international collabo- 
ration grant and thanks the Centre for Mathematical Sciences 
at the University of Cambridge for hosting him for a visit. 

Appendix 

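Lemma 13: For a fixed state $\rho$, the quantum mutual information
is convex in the channel operation:

$$I(A;B)_{\omega} \leq \sum_x p(x)\, I(A;B)_{\omega_x},$$

where $\mathcal{N} \equiv \sum_x p(x)\, \mathcal{N}_x$ and

$$\omega_{AB} \equiv (\mathrm{id} \otimes \mathcal{N}_{A' \to B})(\psi^{\rho}_{AA'}), \qquad \omega^{x}_{AB} \equiv (\mathrm{id} \otimes \mathcal{N}^{x}_{A' \to B})(\psi^{\rho}_{AA'}). \tag{44}$$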

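Proof: It is possible to show that

$$\begin{aligned}
I(A;B)_{\omega} &= H(\rho) + H(\mathcal{N}(\rho)) - H((I \otimes \mathcal{N})(\psi)), \\
I(A;B)_{\omega_x} &= H(\rho) + H(\mathcal{N}_x(\rho)) - H((I \otimes \mathcal{N}_x)(\psi)),
\end{aligned}$$

and the desired inequality becomes

$$H(\rho) + H(\mathcal{N}(\rho)) - H((I \otimes \mathcal{N})(\psi)) \leq \sum_x p(x) \left[ H(\rho) + H(\mathcal{N}_x(\rho)) - H((I \otimes \mathcal{N}_x)(\psi)) \right].$$

This inequality is equivalent to

$$H(\mathcal{N}(\rho)) - H((I \otimes \mathcal{N})(\psi)) \leq \sum_x p(x) \left[ H(\mathcal{N}_x(\rho)) - H((I \otimes \mathcal{N}_x)(\psi)) \right],$$

which in turn is equivalent to convexity of coherent information,
or equivalently, the quantum data processing inequality
for coherent information:

$$I(A \rangle B) \leq I(A \rangle BX).$$

■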
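The convexity statement of Lemma 13 can be spot-checked numerically. The sketch below is an illustration only, under assumptions that are not in the paper: a qubit source with spectrum $(q, 1-q)$, its purification $\sqrt{q}|00\rangle + \sqrt{1-q}|11\rangle$, and three example channels (identity, bit flip, amplitude damping with a hypothetical damping parameter `g = 0.4`). It compares the mutual information of the mixed channel with the average of the individual mutual informations.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy (in bits) of a density matrix."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def mutual_information(kraus, q):
    """I(A;B) after the channel acts on A' of sqrt(q)|00> + sqrt(1-q)|11>."""
    psi = np.array([np.sqrt(q), 0.0, 0.0, np.sqrt(1.0 - q)], dtype=complex)
    rho = np.outer(psi, psi.conj())
    omega = sum(np.kron(np.eye(2), K) @ rho @ np.kron(np.eye(2), K).conj().T
                for K in kraus)
    t = omega.reshape(2, 2, 2, 2)            # indices (a, b, a', b')
    omega_a = np.trace(t, axis1=1, axis2=3)  # trace out B
    omega_b = np.trace(t, axis1=0, axis2=2)  # trace out A
    return entropy(omega_a) + entropy(omega_b) - entropy(omega)

def mixture(channels, probs):
    """Kraus operators of the mixed channel sum_x p(x) N_x."""
    return [np.sqrt(p) * K for p, ks in zip(probs, channels) for K in ks]

g = 0.4  # hypothetical damping parameter, chosen only for illustration
identity = [np.eye(2, dtype=complex)]
bit_flip = [np.array([[0, 1], [1, 0]], dtype=complex)]
amp_damp = [np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex),
            np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)]
CHANNELS = (identity, bit_flip, amp_damp)

def convexity_gap(q, probs, channels=CHANNELS):
    """Average of I(A;B) over the channels minus I(A;B) of the mixed channel."""
    lhs = mutual_information(mixture(channels, probs), q)
    rhs = sum(p * mutual_information(ks, q) for p, ks in zip(probs, channels))
    return rhs - lhs
```

Lemma 13 predicts that `convexity_gap` is never negative. With `q = 0.5` and an equal mixture of the identity and the bit flip, each channel alone gives 2 bits while the mixture gives 1 bit, so the gap is 1.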

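Lemma 14: The quantum rate-distortion function $R^{q}_{\mathrm{eac}}(D)$
is non-increasing and convex:

$$D_1 < D_2 \Rightarrow R^{q}_{\mathrm{eac}}(D_1) \geq R^{q}_{\mathrm{eac}}(D_2),$$

$$R^{q}_{\mathrm{eac}}(\lambda D_1 + (1 - \lambda) D_2) \leq \lambda R^{q}_{\mathrm{eac}}(D_1) + (1 - \lambda) R^{q}_{\mathrm{eac}}(D_2),$$

where $0 \leq \lambda \leq 1$.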

Proof: The proof is similar to Barnum's [4], which in
turn is similar to the one from Ref. [21]. $R^{q}_{\mathrm{eac}}(D)$ is
non-increasing because the domain of minimization becomes larger
after increasing $D$, which implies that the rate-distortion
function can only become smaller. Let $(R_1, D_1)$ and $(R_2, D_2)$
be two points on the information rate-distortion curve and
let $\mathcal{E}_1$ and $\mathcal{E}_2$ be the respective operations that achieve the
minimum in the definition of $R^{q}_{\mathrm{eac}}$. Consider the
map $\mathcal{E}_{\lambda} = \lambda \mathcal{E}_1 + (1 - \lambda) \mathcal{E}_2$. Under the assumption of a
distortion function that is linear in the operation (such as the
entanglement fidelity), it follows that the distortion caused by
$\mathcal{E}_{\lambda}$ is $D_{\lambda} = \lambda D_1 + (1 - \lambda) D_2$. We also have that $R^{q}_{\mathrm{eac}}(D_{\lambda})$
is the minimum over all operations that have distortion $D_{\lambda}$,
so that $R^{q}_{\mathrm{eac}}(D_{\lambda}) \leq I(A;B)_{\omega}$, where

$$\omega_{AB} \equiv (\mathcal{E}_{\lambda})_{A' \to B}(\psi^{\rho}_{AA'}).$$

Finally, we have that the mutual information is convex in the
operation (see Lemma 13) so that
$I(A;B)_{\omega} \leq \lambda R^{q}_{\mathrm{eac}}(D_1) + (1 - \lambda) R^{q}_{\mathrm{eac}}(D_2)$. ■
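The linearity assumption used in the proof of Lemma 14 can be illustrated with the entanglement fidelity, using the standard Kraus-operator formula $F_e(\rho, \mathcal{E}) = \sum_k |\mathrm{Tr}(\rho K_k)|^2$. The sketch below assumes the distortion is taken to be $1 - F_e$; the example channels and all names are hypothetical and chosen only for illustration.

```python
import numpy as np

def entanglement_fidelity(rho, kraus):
    """F_e(rho, E) = sum_k |Tr(rho K_k)|^2 for Kraus operators K_k of E."""
    return float(sum(abs(np.trace(rho @ K)) ** 2 for K in kraus))

def distortion(rho, kraus):
    """Distortion taken as 1 - F_e (an assumption of this sketch)."""
    return 1.0 - entanglement_fidelity(rho, kraus)

def mix(kraus_1, kraus_2, lam):
    """Kraus operators of the map lam * E_1 + (1 - lam) * E_2."""
    return ([np.sqrt(lam) * K for K in kraus_1]
            + [np.sqrt(1.0 - lam) * K for K in kraus_2])

# Example source and channels (illustrative choices only).
rho = np.diag([0.7, 0.3]).astype(complex)
g = 0.2
amp_damp = [np.array([[1, 0], [0, np.sqrt(1 - g)]], dtype=complex),
            np.array([[0, np.sqrt(g)], [0, 0]], dtype=complex)]
bit_flip_half = [np.sqrt(0.5) * np.eye(2, dtype=complex),
                 np.sqrt(0.5) * np.array([[0, 1], [1, 0]], dtype=complex)]

def linearity_defect(lam):
    """Distortion of the mixed map minus the mixture of the distortions."""
    d_mix = distortion(rho, mix(amp_damp, bit_flip_half, lam))
    d_avg = (lam * distortion(rho, amp_damp)
             + (1.0 - lam) * distortion(rho, bit_flip_half))
    return d_mix - d_avg
```

`linearity_defect(lam)` should vanish up to floating-point error for every `lam` in $[0, 1]$, which is exactly the property $D_{\lambda} = \lambda D_1 + (1 - \lambda) D_2$ that the proof relies on.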

Lemma 15 (Superadditivity of mutual information): The
mutual information is superadditive in the sense that

$$I(R_1 R_2; B_1 B_2) \geq I(R_1; B_1) + I(R_2; B_2),$$

where the entropies are with respect to the following state:

$$\omega_{R_1 R_2 B_1 B_2} \equiv \mathcal{N}_{A_1 A_2 \to B_1 B_2}(\phi_{R_1 A_1} \otimes \varphi_{R_2 A_2}),$$

with $\mathcal{N}_{A_1 A_2 \to B_1 B_2}$ some noisy channel, and $\phi_{R_1 A_1}$ and
$\varphi_{R_2 A_2}$ being pure, bipartite states.

Proof: The inequality is equivalent to

$$H(R_1 R_2) + I(R_1 R_2 \rangle B_1 B_2) \geq H(R_1) + I(R_1 \rangle B_1) + H(R_2) + I(R_2 \rangle B_2).$$

Observing that $H(R_1 R_2) = H(R_1) + H(R_2)$ because the
state on $R_1$ and $R_2$ is product, the inequality is equivalent to

$$I(R_1 R_2 \rangle B_1 B_2) \geq I(R_1 \rangle B_1) + I(R_2 \rangle B_2),$$

which is in turn equivalent to

$$I(R_1 B_1; R_2 B_2) \geq I(B_1; B_2).$$

This last inequality follows from the quantum data processing
inequality. ■
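The last equivalence in the proof of Lemma 15 is stated without computation. A short expansion, given here as a sketch that uses the convention $I(A \rangle B) = H(B) - H(AB)$, shows that the difference of the coherent informations equals the difference of the two mutual informations:

```latex
\begin{align*}
& I(R_1 R_2 \rangle B_1 B_2) - I(R_1 \rangle B_1) - I(R_2 \rangle B_2) \\
&\quad = \big[ H(B_1 B_2) - H(R_1 R_2 B_1 B_2) \big]
        - \big[ H(B_1) - H(R_1 B_1) \big]
        - \big[ H(B_2) - H(R_2 B_2) \big] \\
&\quad = \big[ H(R_1 B_1) + H(R_2 B_2) - H(R_1 R_2 B_1 B_2) \big]
        - \big[ H(B_1) + H(B_2) - H(B_1 B_2) \big] \\
&\quad = I(R_1 B_1; R_2 B_2) - I(B_1; B_2).
\end{align*}
```

The right-hand side is non-negative because discarding $R_1$ and $R_2$ is a local operation on each side of the cut $R_1 B_1 \,|\, R_2 B_2$, and such operations cannot increase mutual information.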